Intracellular signaling molecules

ABSTRACT

Various embodiments of the invention provide human intracellular signaling molecules (INTSIG) and polynucleotides which identify and encode INTSIG. Embodiments of the invention also provide expression vectors, host cells, antibodies, agonists, and antagonists. Other embodiments provide methods for diagnosing, treating, or preventing disorders associated with aberrant expression of INTSIG.

TECHNICAL FIELD

[0001] This invention relates to nucleic acid and amino acid sequences of intracellular signaling molecules and to the use of these sequences in the diagnosis, treatment, and prevention of cell proliferative, autoimmune/inflammatory, neurological, gastrointestinal, reproductive, cardiovascular, developmental, and vesicle trafficking disorders, and in the assessment of the effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of intracellular signaling molecules.

BACKGROUND OF THE INVENTION

[0002] Cell-cell communication is essential for the growth, development, and survival of multicellular organisms. Cells communicate by sending and receiving molecular signals. An example of a molecular signal is a growth factor, which binds and activates a specific transmembrane receptor on the surface of a target cell. The activated receptor transduces the signal intracellularly, thus initiating a cascade of biochemical reactions that ultimately affect gene transcription and cell cycle progression in the target cell.

[0003] Intracellular signaling is the process by which cells respond to extracellular signals (hormones, neurotransmitters, growth and differentiation factors, etc.) through a cascade of biochemical reactions that begins with the binding of a signaling molecule to a cell membrane receptor and ends with the activation of an intracellular target molecule. Intermediate steps in the process involve the activation of various cytoplasmic proteins by phosphorylation via protein kinases, and their deactivation by protein phosphatases, and the eventual translocation of some of these activated proteins to the cell nucleus where the transcription of specific genes is triggered. The intracellular signaling process regulates all types of cell functions including cell proliferation, cell differentiation, and gene transcription, and involves a diversity of molecules including protein kinases and phosphatases, and second messenger molecules such as cyclic nucleotides, calcium-calmodulin, inositol, and various mitogens that regulate protein phosphorylation.

[0004] Cells also respond to changing conditions by switching off signals. Many signal transduction proteins are short-lived and rapidly targeted for degradation by covalent ligation to ubiquitin, a highly conserved small protein. Cells also maintain mechanisms to monitor changes in the concentration of denatured or unfolded proteins in membrane-bound extracytoplasmic compartments, including a transmembrane receptor that monitors the concentration of available chaperone molecules in the endoplasmic reticulum and transmits a signal to the cytosol to activate the transcription of nuclear genes encoding chaperones in the endoplasmic reticulum.

[0005] Certain proteins in intracellular signaling pathways serve to link or cluster other proteins involved in the signaling cascade. These proteins are referred to as scaffold, anchoring, or adaptor proteins (reviewed in Pawson, T. and J. D. Scott (1997) Science 278:2075-2080). As many intracellular signaling proteins such as protein kinases and phosphatases have relatively broad substrate specificities, the adaptors help to organize the component signaling proteins into specific biochemical pathways. Many of the above signaling molecules are characterized by the presence of particular domains that promote protein-protein interactions. A sampling of these domains is discussed below, along with other important intracellular messengers.

[0006] Intracellular Signaling Second Messenger Molecules

[0007] Protein Phosphorylation

[0008] Protein kinases and phosphatases play a key role in the intracellular signaling process by controlling the phosphorylation and activation of various signaling proteins. The high energy phosphate for this reaction is generally transferred from the adenosine triphosphate molecule (ATP) to a particular protein by a protein kinase and removed from that protein by a protein phosphatase. Protein kinases are roughly divided into two groups: those that phosphorylate serine or threonine residues (serine/threonine kinases, STK) and those that phosphorylate tyrosine residues (protein tyrosine kinases, PTK). A few protein kinases have dual specificity for serine/threonine and tyrosine residues. Almost all kinases contain a conserved 250-300 amino acid catalytic domain containing specific residues and sequence motifs characteristic of the kinase family (Hardie, G. and S. Hanks (1995) The Protein Kinase Facts Books, Vol I:7-20, Academic Press, San Diego Calif.).
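The grouping of kinases by target residue described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `classify_kinase` and the use of one-letter amino acid codes as input are assumptions made for the example and are not part of the disclosure.

```python
def classify_kinase(target_residues):
    """Classify a kinase by the residues it phosphorylates.

    target_residues: iterable of one-letter amino acid codes
    (hypothetical input format chosen for this sketch).
    """
    targets = set(target_residues)
    ser_thr = bool(targets & {"S", "T"})  # serine or threonine
    tyr = "Y" in targets                  # tyrosine
    if ser_thr and tyr:
        return "dual-specificity"
    if ser_thr:
        return "STK"
    if tyr:
        return "PTK"
    return "unclassified"
```

For instance, `classify_kinase("ST")` falls in the STK group, while a kinase acting on all three residues is reported as dual-specificity.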

[0009] STKs include the second messenger dependent protein kinases such as the cyclic-AMP dependent protein kinases (PKA), involved in mediating hormone-induced cellular responses; calcium-calmodulin (CaM) dependent protein kinases, involved in regulation of smooth muscle contraction, glycogen breakdown, and neurotransmission; and the mitogen-activated protein kinases (MAP kinases) which mediate signal transduction from the cell surface to the nucleus via phosphorylation cascades. Altered PKA expression is implicated in a variety of disorders and diseases including cancer, thyroid disorders, diabetes, atherosclerosis, and cardiovascular disease (Isselbacher, K. J. et al. (1994) Harrison's Principles of Internal Medicine, McGraw-Hill, New York N.Y., pp. 416-431, 1887).

[0010] PTKs are divided into transmembrane, receptor PTKs and nontransmembrane, non-receptor PTKs. Transmembrane PTKs are receptors for most growth factors. Non-receptor PTKs lack transmembrane regions and, instead, form complexes with the intracellular regions of cell surface receptors. Receptors that function through non-receptor PTKs include those for cytokines and hormones (growth hormone and prolactin) and antigen-specific receptors on T and B lymphocytes. Many of these PTKs were first identified as the products of mutant oncogenes in cancer cells in which their activation was no longer subject to normal cellular controls. In fact, about one third of the known oncogenes encode PTKs, and it is well known that cellular transformation (oncogenesis) is often accompanied by increased tyrosine phosphorylation activity (Charbonneau H. and N. K. Tonks (1992) Annu. Rev. Cell Biol. 8:463-493).

[0011] An additional family of protein kinases previously thought to exist only in prokaryotes is the histidine protein kinase family (HPK). HPKs bear little homology with mammalian STKs or PTKs but have distinctive sequence motifs of their own (Davie, J. R. et al. (1995) J. Biol. Chem. 270:19861-19867). A histidine residue in the N-terminal half of the molecule (region I) is an autophosphorylation site. Three additional motifs located in the C-terminal half of the molecule include an invariant asparagine residue in region II and two glycine-rich loops characteristic of nucleotide binding domains in regions III and IV. Recently a branched chain alpha-ketoacid dehydrogenase kinase has been found with characteristics of HPK in rat (Davie et al., supra).

[0012] Protein phosphatases regulate the effects of protein kinases by removing phosphate groups from molecules previously activated by kinases. The two principal categories of protein phosphatases are the protein (serine/threonine) phosphatases (PPs) and the protein tyrosine phosphatases (PTPs). PPs dephosphorylate phosphoserine/threonine residues and are important regulators of many cAMP-mediated hormone responses (Cohen, P. (1989) Annu. Rev. Biochem. 58:453-508). PTPs reverse the effects of protein tyrosine kinases and play a significant role in cell cycle and cell signaling processes (Charbonneau and Tonks, supra). As previously noted, many PTKs are encoded by oncogenes, and oncogenesis is often accompanied by increased tyrosine phosphorylation activity. It is therefore possible that PTPs may prevent or reverse cell transformation and the growth of various cancers by controlling the levels of tyrosine phosphorylation in cells. This hypothesis is supported by studies showing that overexpression of PTPs can suppress transformation in cells, and that specific inhibition of PTPs can enhance cell transformation (Charbonneau and Tonks, supra).

[0013] Phospholipid and Inositol-Phosphate Signaling

[0014] Inositol phospholipids (phosphoinositides) are involved in an intracellular signaling pathway that begins with binding of a signaling molecule to a G-protein linked receptor in the plasma membrane. This leads to the phosphorylation of phosphatidylinositol (PI) residues on the inner side of the plasma membrane to the bisphosphate state (PIP₂) by inositol kinases. Simultaneously, the G-protein linked receptor binding stimulates a trimeric G-protein which in turn activates a phosphoinositide-specific phospholipase C-β. Phospholipase C-β then cleaves PIP₂ into two products, inositol trisphosphate (IP₃) and diacylglycerol. These two products act as mediators for separate signaling events. IP₃ diffuses through the cytosol to induce calcium release from the endoplasmic reticulum (ER), while diacylglycerol remains in the membrane and helps activate protein kinase C, a serine-threonine kinase that phosphorylates selected proteins in the target cell. The calcium response initiated by IP₃ is terminated by the dephosphorylation of IP₃ by specific inositol phosphatases. Cellular responses that are mediated by this pathway are glycogen breakdown in the liver in response to vasopressin, smooth muscle contraction in response to acetylcholine, and thrombin-induced platelet aggregation.

[0015] Inositol-phosphate signaling controls tubby, a membrane-bound transcriptional regulator that serves as an intracellular messenger of Gα_(q)-coupled receptors (Santagata et al. (2001) Science 292:2041-2050). Members of the tubby family contain a C-terminal tubby domain of about 260 amino acids that binds to double-stranded DNA and an N-terminal transcriptional activation domain. Tubby binds to phosphatidylinositol 4,5-bisphosphate, which localizes tubby to the plasma membrane. Activation of the G-protein α_(q) leads to activation of phospholipase C-β and hydrolysis of phosphoinositide. Loss of phosphatidylinositol 4,5-bisphosphate causes tubby to dissociate from the plasma membrane and to translocate to the nucleus where tubby regulates transcription of its target genes. Defects in the tubby gene are associated with obesity, retinal degeneration, and hearing loss (Boggon, T. J. et al. (1999) Science 286:2119-2125).

[0016] Cyclic Nucleotide Signaling

[0017] Cyclic nucleotides (cAMP and cGMP) function as intracellular second messengers to transduce a variety of extracellular signals including hormones, light, and neurotransmitters. In particular, cyclic-AMP dependent protein kinases (PKA) are thought to account for all of the effects of cAMP in most mammalian cells, including various hormone-induced cellular responses. Visual excitation and the phototransmission of light signals in the eye are controlled by cyclic-GMP regulated, Ca²⁺-specific channels. Because cellular levels of cyclic nucleotides mediate these various responses, the synthesis and breakdown of cyclic nucleotides must be tightly regulated. Thus adenylyl cyclase, which synthesizes cAMP from ATP, is activated to increase cAMP levels in muscle by binding of adrenaline to β-adrenergic receptors, while activation of guanylate cyclase and increased cGMP levels in photoreceptors leads to reopening of the Ca²⁺-specific channels and recovery of the dark state in the eye. There are nine known transmembrane isoforms of mammalian adenylyl cyclase, as well as a soluble form preferentially expressed in testis. Soluble adenylyl cyclase contains a P-loop, or nucleotide binding domain, and may be involved in male fertility (Buck, J. et al. (1999) Proc. Natl. Acad. Sci. USA 96:79-84).

[0018] In contrast, hydrolysis of cyclic nucleotides by cAMP and cGMP-specific phosphodiesterases (PDEs) produces the opposite of these and other effects mediated by increased cyclic nucleotide levels. PDEs appear to be particularly important in the regulation of cyclic nucleotides, considering the diversity found in this family of proteins. At least seven families of mammalian PDEs (PDE1-7) have been identified based on substrate specificity and affinity, sensitivity to cofactors, and sensitivity to inhibitory drugs (Beavo, J. A. (1995) Physiol. Rev. 75:725-748). PDE inhibitors have been found to be particularly useful in treating various clinical disorders. Rolipram, a specific inhibitor of PDE4, has been used in the treatment of depression, and similar inhibitors are undergoing evaluation as anti-inflammatory agents. Theophylline is a nonspecific PDE inhibitor used in the treatment of bronchial asthma and other respiratory diseases (Banner, K. H. and C. P. Page (1995) Eur. Respir. J. 8:996-1000).

[0019] Calcium Signaling Molecules

[0020] In nearly all eukaryotic cells, calcium (Ca²⁺) functions as an intracellular signaling molecule in diverse cellular processes including cell proliferation, neurotransmitter secretion, glycogen metabolism, and muscle contraction. Within a resting cell, the cytosolic concentration of Ca²⁺ is less than 10⁻⁷ M. However, when the cell is stimulated by an external signal, such as a neural impulse or a growth factor, the cytosolic concentration of Ca²⁺ increases by about 50-fold. This influx of Ca²⁺ is prompted by the opening of plasma membrane Ca²⁺ channels and by the release of Ca²⁺ from intracellular stores such as the endoplasmic reticulum. Ca²⁺ directly activates regulatory enzymes, such as protein kinase C, which trigger signal transduction pathways. Ca²⁺ also binds to specific Ca²⁺ binding proteins which then activate multiple target proteins including enzymes, membrane transport pumps, and ion channels.
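As a worked illustration of the concentrations quoted above, the following sketch computes the approximate cytosolic Ca²⁺ level after stimulation. It takes 10⁻⁷ M as the resting value, although the text gives this only as an upper bound, and uses the stated 50-fold increase. The constant and function names are hypothetical and chosen for this example.

```python
RESTING_CA_MOLAR = 1e-7  # upper bound on resting cytosolic Ca2+ (from the text)
FOLD_INCREASE = 50       # approximate increase upon stimulation (from the text)


def stimulated_ca_molar(resting=RESTING_CA_MOLAR, fold=FOLD_INCREASE):
    """Return the approximate cytosolic Ca2+ concentration after stimulation."""
    return resting * fold


# Report the result in micromolar for readability.
print(f"{stimulated_ca_molar() * 1e6:.1f} micromolar")
```

Under these assumptions the stimulated concentration is on the order of a few micromolar, and lower if the true resting level is below 10⁻⁷ M.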

[0021] Changes in cytosolic calcium ion concentrations ([Ca²⁺]_(i)) evoke a wide range of cellular responses. Intracellular Ca²⁺-binding proteins are the key molecules in transducing Ca²⁺ signaling via enzymatic reactions or modulation of protein-protein interactions, some of which contribute to cell cycle events, and/or to cellular differentiation. Following stimulation of the cell by an external signal, second messenger molecules such as inositol trisphosphate stimulate the brief release of [Ca²⁺]_(i) from the endoplasmic reticulum into the surrounding cytoplasm. Similar second messenger signaling pathways also occur in the dividing cell nucleus during breakdown of the nuclear membrane and segregation of chromatids during anaphase.

[0022] The calcium-binding domain of many proteins contains the high affinity Ca²⁺-binding motif often referred to as the EF-hand. The EF-hand is characterized by a twelve amino acid residue-containing loop, flanked by two α-helices oriented approximately 90° with respect to one another. Aspartate (D), and glutamate (E) or aspartate residues are usually found at positions 10 and 21, respectively, bordering the twelve amino acid loop. In addition, a conserved glycine residue in the central portion of the loop is found in most Ca²⁺-binding EF-hand domains. Oxygen ligands within this domain coordinate the Ca²⁺ ion. Acidic residues are generally conserved at positions 1, 3, 5, 7, 9, and 12. At least 250 EF-hand proteins defining 39 families have been identified in various tissues. Other non-EF-hand domain, Ca²⁺-binding proteins (CBPs) bind Ca²⁺ using different protein conformations (reviewed in Celio, M. R. et al. (1996) Guidebook to the Calcium-Binding Proteins, Oxford University Press, New York N.Y., pp. 15-20).
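The loop features described above lend themselves to a simple check. The sketch below counts acidic residues at loop positions 1, 3, 5, 7, 9, and 12 and tests for a glycine at position 6. Treating position 6 as the "central portion" of the loop is an assumption of this example, as are the function name `ef_hand_loop_score` and the illustrative input string. It is a rough screen, not a validated motif search.

```python
ACIDIC = {"D", "E"}                        # aspartate, glutamate
CONSERVED_POSITIONS = (1, 3, 5, 7, 9, 12)  # 1-based loop positions from the text
CENTRAL_GLYCINE_POSITION = 6               # assumption: "central portion" = position 6


def ef_hand_loop_score(loop):
    """Score a candidate 12-residue EF-hand loop (one-letter codes).

    Returns (number of conserved positions holding an acidic residue,
             whether glycine occupies the assumed central position).
    """
    loop = loop.upper()
    if len(loop) != 12:
        raise ValueError("an EF-hand loop is expected to have 12 residues")
    acidic_hits = sum(1 for pos in CONSERVED_POSITIONS if loop[pos - 1] in ACIDIC)
    has_central_glycine = loop[CENTRAL_GLYCINE_POSITION - 1] == "G"
    return acidic_hits, has_central_glycine


# Illustrative loop string (hypothetical input chosen for this sketch).
print(ef_hand_loop_score("DKDGDGTITTKE"))
```

A loop need not be acidic at every listed position to bind Ca²⁺, since the text states only that acidic residues are generally conserved there, so the count is better read as a similarity score than as a pass/fail test.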

[0023] Calmodulin (CaM) is the most widely distributed and the most common mediator of calcium effects (Celio et al., supra, pp. 34-40). CaM appears to be the primary sensor of [Ca²⁺]_(i) changes in eukaryotic cells. The binding of Ca²⁺ to CaM induces marked conformational changes in the protein permitting interaction with, and regulation of, over 100 different proteins. CaM interactions are involved in a multitude of cellular processes including, but not limited to, gene regulation, DNA synthesis, cell cycle progression, mitosis, cytokinesis, cytoskeletal organization, muscle contraction, signal transduction, ion homeostasis, exocytosis, and metabolic regulation. Nuclear calmodulin-binding proteins include chURP (chicken U-related protein, related to heterogeneous nuclear ribonucleoprotein U) (Lodge, A. P. et al. (1999) Eur. J. Biochem. 261:137-147).

[0024] CaM contains two pairs of EF-hand domains which are located in the N- and C-terminal halves of the molecule and connected by a flexible central helix. Binding of Ca²⁺ to the EF-hand domains of CaM induces a conformational change in the protein. In the presence of a target peptide, a further conformational change results in the flexible central helix being partially unwound and wrapped around the target peptide. In this manner, CaM interacts with a wide variety of target proteins. Several post-translational modifications of CaM including acylation of the amino terminus and phosphorylation of various serine and threonine residues have been reported.

[0025] The regulation of CBPs has implications for the control of a variety of disorders. Calcineurin, a CaM-regulated protein phosphatase, is a target for inhibition by the immunosuppressive agents cyclosporin and FK506. This indicates the importance of calcineurin and CaM in the immune response and immune disorders (Schwaninger, M. et al. (1993) J. Biol. Chem. 268:23111-23115). The level of CaM is increased several-fold in tumors and tumor-derived cell lines for various types of cancer (Rasmussen, C. D. and A. R. Means (1989) Trends in Neuroscience 12:433-438).

[0026] Calcineurin homologous protein (CHP) and p22 are homologous CBPs which contain EF-hand motifs and show extensive protein sequence similarity to the regulatory subunit of protein phosphatase 2B, calcineurin B (Lin, X. and D. L. Barber (1996) Proc. Natl. Acad. Sci. USA 93:12631-12636; Barroso, M. R. et al. (1996) J. Biol. Chem. 271:10183-10187). CHP is widely expressed in human tissues. It specifically binds to and regulates the activity of NHE1, a ubiquitously expressed Na⁺/H⁺ exchanger. Activation of NHE1 results in an increase in intracellular pH, which in turn activates cell proliferation, differentiation, and neoplastic transformation. The phosphorylation state of CHP is important for NHE1 regulation during cell division, and transient overexpression of CHP inhibits serum- and GTPase-stimulated NHE1 activities (Lin and Barber, supra). p22 is a cytosolic N-myristoylated phosphoprotein which undergoes conformational changes upon binding of calcium. p22 is ubiquitously expressed and may be required for regulating constitutive endocytic, membrane trafficking events (Barroso et al., supra).

[0027] Reticulocalbin (RCN) is a member of the EF-hand Ca²⁺-binding protein family and is a luminal protein of the endoplasmic reticulum (ER). RCN has six repeats of a domain containing an EF-hand motif. Ca²⁺ induces a conformational change in reticulocalbin (Tachikui, H. et al. (1997) J. Biochem. (Tokyo) 121:145-149).

[0028] The S100 proteins are a group of acidic Ca²⁺-binding proteins with mass of approximately 10-12 kDa. These proteins are so named after the solubility of the first isolated protein in 100% saturated ammonium sulfate. The S100 proteins have two Ca²⁺-binding domains. One domain is a low affinity Ca²⁺-binding, basic helix-loop-helix site; the other domain is a high affinity Ca²⁺-binding EF-hand type, acidic helix-loop-helix site (Kligman, D. and D. C. Hilt (1988) Trends Biochem. Sci. 13:437-442). The EF-hand domain also encompasses a part of a region that specifically identifies members of the S100 family of proteins, but does not predict the Ca²⁺-binding properties of the region. (See, e.g., SWISSPROT PROSITE pattern, Accession Number PS00303.) The distribution of particular S100 proteins is dependent on specific cell types, indicating that S100 proteins may be involved in transducing signals of increasing intracellular calcium in a cell type-specific fashion. For example, S100A13 protein is present in human and murine heart and skeletal muscle, and many other members of the S100 protein family, e.g., S100β, are abundant in brain (Wu, T. et al. (1997) J. Biol. Chem. 272:17145-17153).

[0029] S100α is a member of the S100 protein family isolated from human heart (Celio et al., supra, pp. 135-136; Engelkamp, D. et al. (1992) Biochemistry 31:10258-10264). S100α messenger RNA expression is restricted to the heart, skeletal muscle, and brain. Normally, S100α is undetectable in serum; however, S100α is detectable in serum of patients with renal carcinoma at levels correlated with clinical progression of the disease. In addition, S100α levels are increased in serum following myocardial infarction (Kretsinger, R. H. and C. E. Nockolds (1973) J. Biol. Chem. 248:3313-3326; Celio et al., supra, pp. 15-20).

[0030] Elevated serum levels of S100β are associated with disseminated malignant melanoma metastases, suggesting that serum S100β may be of value as a clinical marker for progression of metastatic melanoma (Henze, G. et al. (1997) Dermatology 194:208-212). Messenger RNA levels encoding both an S-100-like protein named calgizzarin and phospholipase A₂ are elevated in colorectal cancers compared with those of normal colorectal mucosa (Tanaka, M. et al. (1995) Cancer Lett. 89:195-200).

[0031] Two CBPs associated with metaplasia and neoplasia are osteonectin and recoverin. Osteonectin is an anti-adhesive secreted glycoprotein involved in tissue remodeling and has one EF-hand, and a protein-protein or protein-heparin interaction domain. Recoverin was identified as an antigen in cancer-associated retinopathy, and is implicated in the pathway from retinal rod guanylate cyclase to rhodopsin. Recoverin is N-myristoylated at the N-terminus, and has three Ca²⁺-binding sites including one low affinity Ca²⁺-binding site and one high affinity Ca²⁺-binding site (Hohenester, E. et al. (1997) EMBO J. 16:3778-3786; Murakami, A. et al. (1992) Biochem. Biophys. Res. Comm. 187:234-244).

[0032] Other Ca²⁺ binding proteins include those involved in Ca²⁺ sequestration and protein folding in the endoplasmic reticulum (ER). For example, Ca²⁺ binding proteins 1 and 2 (CaBP1 and CaBP2) are protein disulfide isomerases that contain two and three thioredoxin-like active sites, respectively (Fullekrug, J. et al. (1994) J. Cell Sci. 107:2719-2727; Van, P. N. et al. (1993) Eur. J. Biochem. 213:789-795). Each active site contains two invariant cysteines which directly participate in reversible oxidation/reduction reactions. CaBP1 and CaBP2 may facilitate the folding of nascent proteins in the ER by catalyzing the formation and isomerization of disulfide bonds. Like all resident soluble ER proteins, CaBP1 and CaBP2 each contain the C-terminal KDEL/KEEL tetrapeptide required for their retention in the ER.
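The C-terminal retention signal mentioned above is straightforward to test for in a protein sequence. The following minimal sketch checks whether a sequence, given in one-letter code, ends in KDEL or KEEL. The function name `has_er_retention_signal` and the example sequences are hypothetical and serve only to illustrate the check.

```python
ER_RETENTION_SIGNALS = ("KDEL", "KEEL")  # tetrapeptides named in the text


def has_er_retention_signal(protein_sequence):
    """Return True if the sequence ends with a KDEL or KEEL tetrapeptide."""
    # str.endswith accepts a tuple of candidate suffixes.
    return protein_sequence.upper().endswith(ER_RETENTION_SIGNALS)


# Hypothetical sequences: only the first carries the signal at its C-terminus.
print(has_er_retention_signal("MAAAKDEL"))
print(has_er_retention_signal("MKDELAAA"))
```

The check is anchored at the C-terminus on purpose: an internal KDEL, as in the second example, does not satisfy the description in the text.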

[0033] Ca²⁺ can enter the cytosol by two pathways, in response to extracellular signals. One pathway acts primarily in nerve signal transduction where Ca²⁺ enters a nerve terminal through a voltage-gated Ca²⁺ channel. Regulation of intracellular Ca²⁺ levels by Ca²⁺ binding proteins is critical for neural signaling and muscle contraction. In particular, elevated levels of intracellular calcium are a major cause of cardiovascular dysfunction, including myocardial ischemia leading to acute infarction. Current therapies for the treatment of myocardial ischemia block the influx of calcium into myocardial and vascular smooth muscle cells.

[0034] The annexins are a family of calcium-binding proteins that associate with the cell membrane (Towle, C. A. and B. V. Treadwell (1992) J. Biol. Chem. 267:5416-5423). Annexins reversibly bind to negatively charged phospholipids (phosphatidylcholine and phosphatidylserine) in a calcium dependent manner. Annexins participate in various processes pertaining to signal transduction at the plasma membrane, including membrane-cytoskeleton interactions, phospholipase inhibition, anticoagulation, and membrane fusion. Annexins contain four to eight repeated segments of about 60 residues. Each repeat folds into five alpha helices wound into a right-handed superhelix.

[0035] G-Protein Signaling

[0036] Guanine nucleotide binding proteins (G-proteins) are critical mediators of signal transduction between a particular class of extracellular receptors, the G-protein coupled receptors (GPCRs), and intracellular second messengers such as cAMP and Ca²⁺. G-proteins are linked to the cytosolic side of a GPCR such that activation of the GPCR by ligand binding stimulates binding of the G-protein to GTP, inducing an “active” state in the G-protein. In the active state, the G-protein acts as a signal to trigger other events in the cell such as the increase of cAMP levels or the release of Ca²⁺ into the cytosol from the ER, which, in turn, regulate phosphorylation and activation of other intracellular proteins. Recycling of the G-protein to the inactive state involves hydrolysis of the bound GTP to GDP by a GTPase activity in the G-protein (Alberts, B. et al. (1994) Molecular Biology of the Cell, Garland Publishing, Inc., New York N.Y., pp. 734-759). The superfamily of G-proteins consists of several families which may be grouped as translational factors, heterotrimeric G-proteins involved in transmembrane signaling processes, and low molecular weight (LMW) G-proteins including the proto-oncogene Ras proteins and products of rab, rap, rho, rac, smg21, smg25, YPT, SEC4, and ARF genes, and tubulins (Kaziro, Y. et al. (1991) Annu. Rev. Biochem. 60:349-400). In all cases, the GTPase activity is regulated through interactions with other proteins.
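The activation and recycling steps described above amount to a two-state cycle, which can be sketched as a small state machine. The class and method names below (`GState`, `GProtein`, `receptor_activated`, `hydrolyze_gtp`) are hypothetical, and the model deliberately omits regulatory proteins and downstream effectors.

```python
from enum import Enum


class GState(Enum):
    INACTIVE_GDP = "GDP-bound (inactive)"
    ACTIVE_GTP = "GTP-bound (active)"


class GProtein:
    """Toy model of the G-protein GTP/GDP cycle (illustrative only)."""

    def __init__(self):
        self.state = GState.INACTIVE_GDP

    def receptor_activated(self):
        """Ligand-bound GPCR stimulates GTP binding, giving the active state."""
        self.state = GState.ACTIVE_GTP

    def hydrolyze_gtp(self):
        """GTPase activity hydrolyzes bound GTP to GDP, restoring the inactive state."""
        if self.state is GState.ACTIVE_GTP:
            self.state = GState.INACTIVE_GDP

    @property
    def signaling(self):
        """True while the G-protein can trigger downstream events."""
        return self.state is GState.ACTIVE_GTP


g = GProtein()
g.receptor_activated()
print(g.state.value)  # active after receptor stimulation
g.hydrolyze_gtp()
print(g.state.value)  # back to inactive after GTP hydrolysis
```

Because hydrolysis is the only route back to the inactive state in this model, anything that slows GTPase activity prolongs signaling, which matches the statement that GTPase activity is regulated through interactions with other proteins.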

[0037] Heterotrimeric G-proteins are composed of 3 subunits, α, β, and γ, which in their inactive conformation associate as a trimer at the inner face of the plasma membrane. Gα binds GDP or GTP and contains the GTPase activity. The βγ complex enhances binding of Gα to a receptor. Gγ is necessary for the folding and activity of Gβ (Neer, E. J. et al. (1994) Nature 371:297-300). Multiple homologs of each subunit have been identified in mammalian tissues, and different combinations of subunits have specific functions and tissue specificities (Spiegel, A. M. (1997) J. Inher. Metab. Dis. 20:113-121).

[0038] The alpha subunits of heterotrimeric G-proteins can be divided into four distinct classes. The α-s class is sensitive to ADP-ribosylation by pertussis toxin which uncouples the receptor:G-protein interaction. This uncoupling blocks signal transduction to receptors that decrease cAMP levels which normally regulate ion channels and activate phospholipases. The inhibitory α-I class is also susceptible to modification by pertussis toxin which prevents α-I from lowering cAMP levels. Two novel classes of α subunits refractory to pertussis toxin modification are α-q, which activates phospholipase C, and α-12, which has sequence homology with the Drosophila gene concertina and may contribute to the regulation of embryonic development (Simon, M. I. (1991) Science 252:802-808).

[0039] The mammalian Gβ and Gγ subunits, each about 340 amino acids long, share more than 80% homology. The Gβ subunit (also called transducin) contains seven repeating units, each about 43 amino acids long. The activity of both subunits may be regulated by other proteins such as calmodulin and phosducin or the neural protein GAP 43 (Clapham, D. and E. Neer (1993) Nature 365:403-406). The β and γ subunits are tightly associated. The β subunit sequences are highly conserved between species, implying that they perform a fundamentally important role in the organization and function of G-protein linked systems (Van der Voorn, L. (1992) FEBS Lett. 307:131-134). They contain seven tandem repeats of the WD-repeat sequence motif, a motif found in many proteins with regulatory functions. WD-repeat proteins contain from four to eight copies of a loosely conserved repeat of approximately 40 amino acids which participates in protein-protein interactions. Mutations and variant expression of β transducin proteins are linked with various disorders. Mutations in LIS1, a subunit of the human platelet activating factor acetylhydrolase, cause Miller-Dieker lissencephaly. RACK1 binds activated protein kinase C, and RbAp48 binds retinoblastoma protein. CstF is required for polyadenylation of mammalian pre-mRNA in vitro and associates with subunits of cleavage-stimulating factor. Defects in the regulation of β-catenin contribute to the neoplastic transformation of human cells. The WD40 repeats of the human F-box protein bTrCP mediate binding to β-catenin, thus regulating the targeted degradation of β-catenin by ubiquitin ligase (Neer et al., supra; Hart, M. et al. (1999) Curr. Biol. 9:207-210). The γ subunit primary structures are more variable than those of the β subunits. They are often post-translationally modified by isoprenylation and carboxyl-methylation of a cysteine residue four amino acids from the C-terminus; this appears to be necessary for the interaction of the βγ subunit with the membrane and with other G-proteins. The βγ subunit has been shown to modulate the activity of isoforms of adenylyl cyclase, phospholipase C, and some ion channels. It is involved in receptor phosphorylation via specific kinases, and has been implicated in the p21ras-dependent activation of the MAP kinase cascade and the recognition of specific receptors by G-proteins (Clapham and Neer, supra).
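The positional rule for the modified cysteine in the γ subunit can be expressed as a one-line sequence test. The sketch below reads "four amino acids from the C-terminus" as the fourth-from-last residue, which is an interpretive assumption. The function name and the example sequences are hypothetical.

```python
def has_cysteine_four_from_c_terminus(sequence):
    """Return True if the fourth-from-last residue is cysteine (one-letter code)."""
    sequence = sequence.upper()
    return len(sequence) >= 4 and sequence[-4] == "C"


# Hypothetical sequences for illustration only.
print(has_cysteine_four_from_c_terminus("MASNNTCAIL"))  # cysteine at position -4
print(has_cysteine_four_from_c_terminus("MASNNTAAIL"))  # no cysteine at position -4
```

A positive result indicates only that the positional requirement is met. Whether the residue is actually isoprenylated depends on cellular factors that a sequence check cannot capture.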

[0040] G-proteins interact with a variety of effectors including adenylyl cyclase (Clapham and Neer, supra). The signaling pathway mediated by cAMP is mitogenic in hormone-dependent endocrine tissues such as adrenal cortex, thyroid, ovary, pituitary, and testes. Cancers in these tissues have been related to a mutationally activated form of a Gα, known as the gsp (Gs protein) oncogene (Dhanasekaran, N. et al. (1998) Oncogene 17:1383-1394). Another effector is phosducin, a retinal phosphoprotein, which forms a specific complex with retinal Gβ and Gγ (Gβγ) and modulates the ability of Gβγ to interact with retinal Gα (Clapham and Neer, supra).

[0041] Irregularities in the G-protein signaling cascade may result in abnormal activation of leukocytes and lymphocytes, leading to the tissue damage and destruction seen in many inflammatory and autoimmune diseases such as rheumatoid arthritis, biliary cirrhosis, hemolytic anemia, lupus erythematosus, and thyroiditis. Abnormal cell proliferation, including cyclic AMP stimulation of brain, thyroid, adrenal, and gonadal tissue proliferation, is regulated by G-proteins. Mutations in Gα subunits have been found in growth-hormone-secreting pituitary somatotroph tumors, hyperfunctioning thyroid adenomas, and ovarian and adrenal neoplasms (Meij, J. T. A. (1996) Mol. Cell. Biochem. 157:31-38; Aussel, C. et al. (1988) J. Immunol. 140:215-220).

[0042] LMW G-proteins are GTPases which regulate cell growth, cell cycle control, protein secretion, and intracellular vesicle interaction. They consist of single polypeptides which, like the alpha subunit of the heterotrimeric G-proteins, are able to bind to and hydrolyze GTP, thus cycling between an inactive and an active state. LMW G-proteins respond to extracellular signals from receptors and activating proteins by transducing mitogenic signals involved in various cell functions. The binding and hydrolysis of GTP regulates the response of LMW G-proteins and acts as an energy source during this process (Bokoch, G. M. and C. J. Der (1993) FASEB J. 7:750-759).

[0043] At least sixty members of the LMW G-protein superfamily have been identified and are currently grouped into the ras, rho, arf, sar1, ran, and rab subfamilies. Activated ras genes were initially found in human cancers, and subsequent studies confirmed that ras function is critical in determining whether cells continue to grow or become differentiated. Ras1 and Ras2 proteins stimulate adenylate cyclase (Kaziro et al., supra), affecting a broad array of cellular processes. Stimulation of cell surface receptors activates Ras which, in turn, activates cytoplasmic kinases. These kinases translocate to the nucleus and activate key transcription factors that control gene expression and protein synthesis (Barbacid, M. (1987) Annu. Rev. Biochem. 56:779-827; Treisman, R. (1994) Curr. Opin. Genet. Dev. 4:96-98). Other members of the LMW G-protein superfamily have roles in signal transduction that vary with the function of the activated genes and the locations of the G-proteins that initiate the activity. Rho G-proteins control signal transduction pathways that link growth factor receptors to actin polymerization, which is necessary for normal cellular growth and division. The rab, arf, and sar1 families of proteins control the translocation of vesicles to and from membranes for protein processing, localization, and secretion. Vesicle- and target-specific identifiers (v-SNAREs and t-SNAREs) bind to each other and dock the vesicle to the acceptor membrane. The budding process is regulated by the closely related ADP ribosylation factors (ARFs) and SAR proteins, while rab proteins allow assembly of SNARE complexes and may play a role in removal of defective complexes (Rothman, J. and F. Wieland (1996) Science 272:227-234). Ran G-proteins are located in the nucleus of cells and have a key role in nuclear protein import, the control of DNA synthesis, and cell-cycle progression (Hall, A. (1990) Science 249:635-640; Barbacid, supra; Ktistakis, N. (1998) BioEssays 20:495-504; and Sasaki, T. and Y. Takai (1998) Biochem. Biophys. Res. Commun. 245:641-645).

[0044] Low molecular weight GTP-binding proteins play critical roles in cellular protein trafficking events, such as the translocation of proteins and soluble complexes from the cytosol to the membrane through an exchange of GDP for GTP (Ktistakis, N. T. (1998) BioEssays 20:495-504). In vesicle transport, the interaction between vesicle- and target-specific identifiers (v-SNAREs and t-SNAREs) docks the vesicle to the acceptor membrane. The budding process is regulated by GTPases such as the closely related ADP ribosylation factors (ARFs) and SAR proteins, while GTPases such as Rab allow assembly of SNARE complexes and may play a role in removal of defective complexes (Rothman, J. E. and F. T. Wieland (1996) Science 272:227-234). The rab proteins control the translocation of vesicles to and from membranes for protein localization, protein processing, and secretion. Rab proteins have a highly variable amino terminus containing membrane-specific signal information and a prenylated carboxy terminus which determines the target membrane to which the Rab proteins anchor. The rho GTP-binding proteins control signal transduction pathways that link growth factor receptors to actin polymerization, which is necessary for normal cellular growth and division. The ran GTP-binding proteins are located in the nucleus of cells and have a key role in nuclear protein import, the control of DNA synthesis, and cell-cycle progression (Hall, A. (1990) Science 249:635-640; Scheffzek, K. et al. (1995) Nature 374:378-381).

[0045] A large family of Ras-like enzymes, the Rab GTPases, play key roles in the endocytic and secretory pathways. The function of Rab proteins in vesicular transport requires the cooperation of many other proteins. Specifically, the membrane-targeting process is assisted by a series of escort proteins (Khosravi-Far, R. et al. (1991) Proc. Natl. Acad. Sci. USA 88:6264-6268). In the medial-Golgi, it has been shown that GTP-bound Rab proteins initiate the binding of VAMP-like proteins of the transport vesicle to syntaxin-like proteins on the acceptor membrane, which subsequently triggers a cascade of protein-binding and membrane-fusion events. After transport, GTPase-activating proteins (GAPs) in the target membrane are responsible for converting the GTP-bound Rab proteins to their GDP-bound state. Finally, guanine-nucleotide dissociation inhibitor (GDI) recruits the GDP-bound proteins to their membrane of origin.

[0046] More than 30 Rab proteins have been identified in a variety of species, and each has a characteristic intracellular location and distinct transport function. In particular, Rab1 and Rab2 are important in ER-to-Golgi transport; Rab3 transports secretory vesicles to the extracellular membrane; Rab5 is localized to endosomes and regulates the fusion of early endosomes into late endosomes; Rab6 is specific to the Golgi apparatus and regulates intra-Golgi transport events; Rab7 and Rab9 stimulate the fusion of late endosomes and Golgi vesicles with lysosomes, respectively; and Rab10 mediates vesicle fusion from the medial Golgi to the trans Golgi. Mutant forms of Rab proteins are able to block protein transport along a given pathway or alter the sizes of entire organelles. Therefore, Rabs play key regulatory roles in membrane trafficking (Schimmöller, I. S. and S. R. Pfeffer (1998) J. Biol. Chem. 243:22161-22164).

[0047] The cycling of LMW GTP-binding proteins between the GTP-bound active form and the GDP-bound inactive form is regulated by a variety of proteins. Guanosine nucleotide exchange factors (GEFs) increase the rate of nucleotide dissociation by several orders of magnitude, thus facilitating release of GDP and loading with GTP. The best characterized is the mammalian homolog of the Drosophila Son-of-Sevenless protein. Certain Ras-family proteins are also regulated by guanine nucleotide dissociation inhibitors (GDIs), which inhibit GDP dissociation. The intrinsic rate of GTP hydrolysis of the LMW GTP-binding proteins is typically very slow, but it can be stimulated by several orders of magnitude by GAPs (Geyer, M. and A. Wittinghofer (1997) Curr. Opin. Struct. Biol. 7:786-792). Both GEF and GAP activity may be controlled in response to extracellular stimuli and modulated by accessory proteins such as RalBP1 and POB1. Mutant Ras-family proteins, which bind but cannot hydrolyze GTP, are permanently activated and cause cell proliferation or cancer, as do GEFs that inappropriately activate LMW GTP-binding proteins, such as the human oncogene NET1, a Rho-GEF (Drivas, G. T. et al. (1990) Mol. Cell Biol. 10:1793-1798; Alberts, A. S. and R. Treisman (1998) EMBO J. 14:4075-4085).
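The regulatory cycle described in this paragraph can be illustrated with a minimal state model. The sketch below is illustrative only: the class name `SmallGTPase` and its methods are hypothetical, and the model reduces GEF, GAP, and GDI action to simple state changes without any kinetics.

```python
from dataclasses import dataclass


@dataclass
class SmallGTPase:
    """Toy model of an LMW GTP-binding protein (hypothetical, no kinetics)."""
    nucleotide: str = "GDP"      # "GDP" = inactive form, "GTP" = active form
    gdi_bound: bool = False      # a bound GDI inhibits GDP dissociation
    can_hydrolyze: bool = True   # False models a hydrolysis-deficient mutant

    @property
    def active(self) -> bool:
        return self.nucleotide == "GTP"

    def apply_gef(self) -> None:
        # A GEF promotes GDP release and GTP loading, unless a GDI blocks it.
        if not self.gdi_bound:
            self.nucleotide = "GTP"

    def apply_gap(self) -> None:
        # A GAP stimulates GTP hydrolysis; a mutant that cannot hydrolyze
        # GTP stays in the active, GTP-bound form.
        if self.can_hydrolyze:
            self.nucleotide = "GDP"


wild_type = SmallGTPase()
wild_type.apply_gef()   # loaded with GTP: active
wild_type.apply_gap()   # GTP hydrolyzed: inactive again

mutant = SmallGTPase(can_hydrolyze=False)
mutant.apply_gef()
mutant.apply_gap()      # no effect: remains active
```

In this toy model the hydrolysis-deficient mutant stays in the active state after GAP action, mirroring the permanently activated Ras-family mutants described above.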

[0048] A member of the ARF family of G-proteins is centaurin beta 1A, a regulator of membrane traffic and the actin cytoskeleton. The centaurin β family of GTPase-activating proteins (GAPs) and Arf guanine nucleotide exchange factors contain pleckstrin homology (PH) domains which are activated by phosphoinositides. PH domains bind phosphoinositides, implicating PH domains in signaling processes. Phosphoinositides have a role in converting Arf-GTP to Arf-GDP via the centaurin β family and a role in Arf activation (Kam, J. L. et al. (2000) J. Biol. Chem. 275:9653-9663). The rho GAP family is also implicated in the regulation of actin polymerization at the plasma membrane and in several cellular processes. The gene ARHGAP6 encodes GTPase-activating protein 6 isoform 4. Mutations in ARHGAP6, seen as a deletion of a 500 kb critical region in Xp22.3, cause the syndrome microphthalmia with linear skin defects (MLS). MLS is an X-linked dominant, male-lethal syndrome (Prakash, S. K. et al. (2000) Hum. Mol. Genet. 9:477-488).

[0049] A member of the Rho family of G-proteins is CDC42, a regulator of cytoskeletal rearrangements required for cell division. CDC42 is inactivated by a specific GAP (CDC42GAP) that strongly stimulates the GTPase activity of CDC42 while having a much lesser effect on other Rho family members. CDC42GAP also contains an SH3-binding domain that interacts with the SH3 domains of cell signaling proteins such as p85 alpha and c-Src, suggesting that CDC42GAP may serve as a link between CDC42 and other cell signaling pathways (Barfod, E. T. et al. (1993) J. Biol. Chem. 268:26059-26062).

[0050] The Dbl proteins are a family of GEFs for the Rho and Ras G-proteins (Whitehead, I. P. et al. (1997) Biochim. Biophys. Acta 1332:F1-F23). All Dbl family members contain a Dbl homology (DH) domain of approximately 180 amino acids, as well as a pleckstrin homology (PH) domain located immediately C-terminal to the DH domain. Most Dbl proteins have oncogenic activity, as demonstrated by the ability to transform various cell lines, consistent with roles as regulators of Rho-mediated oncogenic signaling pathways. The kalirin proteins are neuron-specific members of the Dbl family, which are localized to distinct subcellular regions of cultured neurons (Johnson, R. C. (2000) J. Cell Biol. 275:19324-19333).

[0051] Other regulators of G-protein signaling (RGS) also exist that act primarily by negatively regulating the G-protein pathway by an unknown mechanism (Druey, K. M. et al. (1996) Nature 379:742-746). Some 15 members of the RGS family have been identified. RGS family members are related structurally through similarities in an approximately 120 amino acid region termed the RGS domain and functionally by their ability to inhibit the interleukin (cytokine) induction of MAP kinase in cultured mammalian 293T cells (Druey et al., supra).

[0052] The Immuno-associated nucleotide (IAN) family of proteins has GTP-binding activity as indicated by the conserved ATP/GTP-binding site P-loop motif. The IAN family includes IAN-1, IAN-4, IAP38, and IAG-1. IAN-1 is expressed in the immune system, specifically in T cells and thymocytes. Its expression is induced during thymic events (Poirier, G. M. C. et al. (1999) J. Immunol. 163:4960-4969). IAP38 is expressed in B cells and macrophages and its expression is induced in splenocytes by pathogens. IAG-1, which is a plant molecule, is induced upon bacterial infection (Krucken, J. et al. (1997) Biochem. Biophys. Res. Commun. 230:167-170). IAN-4 is a mitochondrial membrane protein which is preferentially expressed in hematopoietic precursor 32D cells transfected with wild-type versus mutant forms of the bcr/abl oncogene. The bcr/abl oncogene is known to be associated with chronic myelogenous leukemia, a clonal myelo-proliferative disorder, which is due to the translocation between the bcr gene on chromosome 22 and the abl gene on chromosome 9. Bcr is the breakpoint cluster region gene and abl is the cellular homolog of the transforming gene of the Abelson murine leukemia virus. Therefore, the IAN family of proteins appears to play a role in cell survival in immune responses and cellular transformation (Daheron, L. et al. (2001) Nucleic Acids Res. 29:1308-1316).
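As an illustration of how a conserved ATP/GTP-binding P-loop motif might be located in a protein sequence, the following minimal sketch scans for the commonly used consensus pattern [AG]-x(4)-G-K-[ST]. The function name `find_p_loops` is hypothetical, and the example sequence is a short fragment resembling the N-terminus of human H-Ras, chosen only for illustration; it is not one of the sequences of this disclosure.

```python
import re

# Consensus for the ATP/GTP-binding site motif A (P-loop): [AG]-x(4)-G-K-[ST]
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched residues) for each non-overlapping hit."""
    return [(m.start(), m.group()) for m in P_LOOP.finditer(sequence.upper())]


# Illustrative fragment only (assumed to resemble the H-Ras N-terminus).
example = "MTEYKLVVVGAGGVGKSALTIQ"
hits = find_p_loops(example)
print(hits)
```

A match to this short pattern is only suggestive of nucleotide binding; the pattern is degenerate and may also occur by chance in unrelated proteins.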

[0053] Formin-related genes (FRL) comprise a large family of morphoregulatory genes and have been shown to play important roles in morphogenesis, embryogenesis, cell polarity, cell migration, and cytokinesis through their interaction with Rho family small GTPases. Formin was first identified in mouse limb deformity (ld) mutants where the distal bones and digits of all limbs are fused and reduced in size. FRL contains formin homology domains FH1, FH2, and FH3. The FH1 domain has been shown to bind the Src homology 3 (SH3) domain, WWP/WW domains, and profilin. The FH2 domain is conserved and was shown to be essential for formin function, as disruption at the FH2 domain results in the characteristic ld phenotype. The FH3 domain is located at the N-terminus of FRL, and is required for associating with Rac, a Rho family GTPase (Yayoshi-Yamamoto, S. et al. (2000) Mol. Cell. Biol. 20:6872-6881).

[0054] Signaling Complex Protein Domains

[0055] PDZ domains were named for three proteins in which this domain was initially discovered. These proteins include PSD-95 (postsynaptic density 95), Dlg (Drosophila lethal (1) discs large-1), and ZO-1 (zonula occludens-1). These proteins play important roles in neuronal synaptic transmission, tumor suppression, and cell junction formation, respectively. Since the discovery of these proteins, over sixty additional PDZ-containing proteins have been identified in diverse prokaryotic and eukaryotic organisms. This domain has been implicated in receptor and ion channel clustering and in the targeting of multiprotein signaling complexes to specialized functional regions of the cytosolic face of the plasma membrane. (For a review of PDZ domain-containing proteins, see Ponting, C. P. et al. (1997) Bioessays 19:469-479.) A large proportion of PDZ domains are found in the eukaryotic MAGUK (membrane-associated guanylate kinase) protein family, members of which bind to the intracellular domains of receptors and channels. However, PDZ domains are also found in diverse membrane-localized proteins such as protein tyrosine phosphatases, serine/threonine kinases, G-protein cofactors, and synapse-associated proteins such as syntrophins and neuronal nitric oxide synthase (nNOS). Generally, about one to three PDZ domains are found in a given protein, although up to nine PDZ domains have been identified in a single protein. The glutamate receptor interacting protein (GRIP) contains seven PDZ domains. GRIP is an adaptor that links certain glutamate receptors to other proteins and may be responsible for the clustering of these receptors at excitatory synapses in the brain (Dong, R et al. (1997) Nature 386:279-284). The Drosophila scribble (SCRIB) protein contains both multiple PDZ domains and leucine-rich repeats. SCRIB is located at the epithelial septate junction, which is analogous to the vertebrate tight junction, at the boundary of the apical and basolateral cell surface. SCRIB is involved in the distribution of apical proteins and correct placement of adherens junctions to the basolateral cell surface (Bilder, D. and N. Perrimon (2000) Nature 403:676-680).

[0056] The PX domain is an example of a domain specialized for promoting protein-protein interactions. The PX domain is found in sorting nexins and in a variety of other proteins, including the PhoX components of NADPH oxidase and the Cpk class of phosphatidylinositol 3-kinase. Most PX domains contain a polyproline motif which is characteristic of SH3 domain-binding proteins (Ponting, C. P. (1996) Protein Sci. 5:2353-2357). SH3 domain-mediated interactions involving the PhoX components of NADPH oxidase play a role in the formation of the NADPH oxidase multi-protein complex (Leto, T. L. et al. (1994) Proc. Natl. Acad. Sci. USA 91:10650-10654; Wilson, L. et al. (1997) Inflamm. Res. 46:265-271).

[0057] The SH3 domain is defined by homology to a region of the proto-oncogene c-Src, a cytoplasmic protein tyrosine kinase. SH3 is a small domain of 50 to 60 amino acids that interacts with proline-rich ligands. SH3 domains are found in a variety of eukaryotic proteins involved in signal transduction, cell polarization, and membrane-cytoskeleton interactions. In some cases, SH3 domain-containing proteins interact directly with receptor tyrosine kinases. For example, the SLAP-130 protein is a substrate of the T-cell receptor (TCR) stimulated protein kinase. SLAP-130 interacts via its SH3 domain with the protein SLP-76 to affect the TCR-induced expression of interleukin-2 (Musci, M. A. et al. (1997) J. Biol. Chem. 272:11674-11677). Another recently identified SH3 domain protein is macrophage actin-associated tyrosine-phosphorylated protein (MAYP), which is phosphorylated during the response of macrophages to colony stimulating factor-1 (CSF-1) and is likely to play a role in regulating the CSF-1-induced reorganization of the actin cytoskeleton (Yeung, Y.-G. et al. (1998) J. Biol. Chem. 273:30638-30642). The structure of the SH3 domain is characterized by two antiparallel beta sheets packed against each other at right angles. This packing forms a hydrophobic pocket lined with residues that are highly conserved between different SH3 domains. This pocket makes critical hydrophobic contacts with proline residues in the ligand (Feng, S. et al. (1994) Science 266:1241-1247).

[0058] A novel domain, called the WW domain, resembles the SH3 domain in its ability to bind proline-rich ligands. This domain was originally discovered in dystrophin, a cytoskeletal protein with direct involvement in Duchenne muscular dystrophy (Bork, P. and M. Sudol (1994) Trends Biochem. Sci. 19:531-533). WW domains have since been discovered in a variety of intracellular signaling molecules involved in development, cell differentiation, and cell proliferation. The structure of the WW domain is composed of beta strands grouped around four conserved aromatic residues, generally tryptophan.

[0059] Like SH3, the SH2 domain is defined by homology to a region of c-Src. SH2 domains interact directly with phospho-tyrosine residues, thus providing an immediate mechanism for the regulation and transduction of receptor tyrosine kinase-mediated signaling pathways. For example, as many as ten distinct SH2 domains are capable of binding to phosphorylated tyrosine residues in the activated PDGF receptor, thereby providing a highly coordinated and finely tuned response to ligand-mediated receptor activation (Schaffhausen, B. (1995) Biochim. Biophys. Acta. 1242:61-75). The BLNK protein is a linker protein involved in B cell activation that bridges B cell receptor-associated kinases with SH2 domain effectors that link to various signaling pathways (Fu, C. et al. (1998) Immunity 9:93-103).

[0060] The pleckstrin homology (PH) domain was originally identified in pleckstrin, the predominant substrate for protein kinase C in platelets. Since its discovery, this domain has been identified in over 90 proteins involved in intracellular signaling or cytoskeletal organization. Proteins containing the pleckstrin homology domain include a variety of kinases, phospholipase-C isoforms, guanine nucleotide release factors, and GTPase activating proteins. For example, members of the FGD1 family contain both Rho-guanine nucleotide exchange factor (GEF) and PH domains, as well as a FYVE zinc finger domain. FGD1 is the gene responsible for faciogenital dysplasia, an inherited skeletal dysplasia (Pasteris, N. G. and J. L. Gorski (1999) Genomics 60:57-66). Many PH domain proteins function in association with the plasma membrane, and this association appears to be mediated by the PH domain itself. PH domains share a common structure composed of two antiparallel beta sheets flanked by an amphipathic alpha helix. Variable loops connecting the component beta strands generally occur within a positively charged environment and may function as ligand binding sites (Lemmon, M. A. et al. (1996) Cell 85:621-624).

[0061] Ankyrin (ANK) repeats mediate protein-protein interactions associated with diverse intracellular signaling functions. For example, ANK repeats are found in proteins involved in cell proliferation such as kinases, kinase inhibitors, tumor suppressors, and cell cycle control proteins (Kalus, W. et al. (1997) FEBS Lett. 401:127-132; Ferrante, A. W. et al. (1995) Proc. Natl. Acad. Sci. USA 92:1911-1915). These proteins generally contain multiple ANK repeats, each composed of about 33 amino acids. Myotrophin is an ANK repeat protein that plays a key role in the development of cardiac hypertrophy, a contributing factor to many heart diseases. Structural studies show that the myotrophin ANK repeats, like other ANK repeats, each form a helix-turn-helix core preceded by a protruding “tip.” These tips are of variable sequence and may play a role in protein-protein interactions. The helix-turn-helix regions of the ANK repeats stack on top of one another and are stabilized by hydrophobic interactions (Yang, Y. et al. (1998) Structure 6:619-626). Members of the ASB protein family contain a suppressor of cytokine signaling (SOCS) domain as well as multiple ankyrin repeats (Hilton, D. J. et al. (1998) Proc. Natl. Acad. Sci. USA 95:114-119).

[0062] The tetratricopeptide repeat (TPR) is a 34 amino acid repeated motif found in organisms from bacteria to humans. TPRs are predicted to form amphipathic helices, and appear to mediate protein-protein interactions. TPR domains are found in CDC16, CDC23, and CDC27, members of the anaphase promoting complex which targets proteins for degradation at the onset of anaphase. Other processes involving TPR proteins include cell cycle control, transcription repression, stress response, and protein kinase inhibition (Lamb, J. R. et al. (1995) Trends Biochem. Sci. 20:257-259).

[0063] The armadillo/beta-catenin repeat is a 42 amino acid motif which forms a superhelix of alpha helices when tandemly repeated. The structure of the armadillo repeat region from beta-catenin revealed a shallow groove of positive charge on one face of the superhelix, which is a potential binding surface. The armadillo repeats of beta-catenin, plakoglobin, and p120^(cas) bind the cytoplasmic domains of cadherins. Beta-catenin/cadherin complexes are targets of regulatory signals that govern cell adhesion and mobility (Huber, A. H. et al. (1997) Cell 90:871-882).

[0064] Eight tandem repeats of about 40 residues (WD-40 repeats), each containing a central Trp-Asp motif, make up beta-transducin (G-beta), which is one of the three subunits (alpha, beta, and gamma) of the guanine nucleotide-binding proteins (G proteins). In higher eukaryotes G-beta exists as a small multigene family of highly conserved proteins of about 340 amino acid residues.

[0065] Signaling by Notch family receptors controls cell fate decisions during development (Frisen, J. and U. Lendahl (2001) Bioessays 23:3-7). The Notch receptor signaling pathway is involved in the morphogenesis and development of many organs and tissues in multicellular species. Notch receptors are large transmembrane proteins that contain extracellular regions made up of repeated EGF domains. Notchless was identified in a screen for molecules that modulate Notch activity (Royet, J. et al. (1998) EMBO J. 17:7351-7360). Notchless, which contains nine WD40 repeats, binds to the cytoplasmic domain of Notch and inhibits Notch activity.

[0066] Expression Profiling

[0067] Microarrays are analytical tools used in bioanalysis. A microarray has a plurality of molecules spatially distributed over, and stably associated with, the surface of a solid support. Microarrays of polypeptides, polynucleotides, and/or antibodies have been developed and find use in a variety of applications, such as gene sequencing, monitoring gene expression, gene mapping, bacterial identification, drug discovery, and combinatorial chemistry.

[0068] One area in particular in which microarrays find use is in gene expression analysis. Array technology can provide a simple way to explore the expression of a single polymorphic gene or the expression profile of a large number of related or unrelated genes. When the expression of a single gene is examined, arrays are employed to detect the expression of a specific gene or its variants. When an expression profile is examined, arrays provide a platform for identifying genes that are tissue specific, are affected by a substance being tested in a toxicology assay, are part of a signaling cascade, carry out housekeeping functions, or are specifically related to a particular genetic predisposition, condition, disease, or disorder.
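By way of illustration only, an expression profile comparison of the kind described above might be reduced to log2 ratios between a treated and a control sample. The sketch below is a simplified assumption, not a method taken from this disclosure: the gene names, intensity values, pseudocount, and two-fold threshold are all hypothetical.

```python
import math


def log2_ratios(treated: dict[str, float], control: dict[str, float],
                pseudocount: float = 1.0) -> dict[str, float]:
    """log2(treated/control) per gene; the pseudocount avoids division by zero."""
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in treated
        if gene in control
    }


def differentially_expressed(ratios: dict[str, float],
                             threshold: float = 1.0) -> list[str]:
    """Genes whose absolute log2 ratio meets the threshold (1.0 = two-fold)."""
    return sorted(g for g, r in ratios.items() if abs(r) >= threshold)


# Hypothetical background-corrected signal intensities.
treated = {"geneA": 7.0, "geneB": 3.0, "geneC": 0.0}
control = {"geneA": 1.0, "geneB": 3.0, "geneC": 3.0}

ratios = log2_ratios(treated, control)
changed = differentially_expressed(ratios)
```

In practice, array data would typically be normalized and assessed across replicates before any threshold is applied; this sketch omits those steps.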

[0069] Breast cancer is the most frequently diagnosed type of cancer in American women and the second most frequent cause of cancer death. The lifetime risk of an American woman developing breast cancer is 1 in 8, and one-third of women diagnosed with breast cancer die of the disease. A number of risk factors have been identified, including hormonal and genetic factors. One genetic defect associated with breast cancer results in a loss of heterozygosity (LOH) at multiple loci such as p53, Rb, BRCA1, and BRCA2. Another genetic defect is gene amplification involving genes such as c-myc and c-erbB2 (Her2-neu gene). Steroid and growth factor pathways are also altered in breast cancer, notably the estrogen, progesterone, and epidermal growth factor (EGF) pathways. Breast cancer evolves through a multi-step process whereby premalignant mammary epithelial cells undergo a relatively defined sequence of events leading to tumor formation. An early event in tumor development is ductal hyperplasia. Cells undergoing rapid neoplastic growth gradually progress to invasive carcinoma and become metastatic to the lung, bone, and potentially other organs. Variables that may influence the process of tumor progression and malignant transformation include genetic factors, environmental factors, growth factors, and hormones.

[0070] As with most tumors, prostate cancer develops through a multistage progression ultimately resulting in an aggressive tumor phenotype. The initial step in tumor progression involves the hyperproliferation of normal luminal and/or basal epithelial cells. Androgen responsive cells become hyperplastic and evolve into early-stage tumors. Although early-stage tumors are often androgen sensitive and respond to androgen ablation, a population of androgen independent cells evolves from the hyperplastic population. These cells represent a more advanced form of prostate tumor that may become invasive and potentially become metastatic to the bone, brain, or lung. A variety of genes may be differentially expressed during tumor progression. For example, loss of heterozygosity (LOH) is frequently observed on chromosome 8p in prostate cancer. Fluorescence in situ hybridization (FISH) revealed a deletion for at least 1 locus on 8p in 29 (69%) tumors, with a significantly higher frequency of the deletion on 8p21.2-p21.1 in advanced prostate cancer than in localized prostate cancer, implying that deletions on 8p22-p21.3 play an important role in tumor differentiation, while 8p21.2-p21.1 deletion plays a role in progression of prostate cancer (Oba, K. et al. (2001) Cancer Genet. Cytogenet. 124:20-26).

[0071] Steroids are a class of lipid-soluble molecules, including cholesterol, bile acids, vitamin D, and hormones, that share a common four-ring structure based on cyclopentanoperhydrophenanthrene and that carry out a wide variety of functions. Cholesterol, for example, is a component of cell membranes that controls membrane fluidity. It is also a precursor for bile acids which solubilize lipids and facilitate absorption in the small intestine during digestion. Vitamin D regulates the absorption of calcium in the small intestine and controls the concentration of calcium in plasma. Steroid hormones, produced by the adrenal cortex, ovaries, and testes, include glucocorticoids, mineralocorticoids, androgens, and estrogens. They control various biological processes by binding to intracellular receptors that regulate transcription of specific genes in the nucleus. Glucocorticoids, for example, increase blood glucose concentrations by regulation of gluconeogenesis in the liver, increase blood concentrations of fatty acids by promoting lipolysis in adipose tissues, modulate sensitivity to catecholamines in the central nervous system, and reduce inflammation. The principal mineralocorticoid, aldosterone, is produced by the adrenal cortex and acts on cells of the distal tubules of the kidney to enhance sodium ion reabsorption. Androgens, produced by the interstitial cells of Leydig in the testis, include the male sex hormone testosterone, which triggers changes at puberty, the production of sperm, and maintenance of secondary sexual characteristics. Female sex hormones, estrogen and progesterone, are produced by the ovaries and also by the placenta and adrenal cortex of the fetus during pregnancy. Estrogen regulates female reproductive processes and secondary sexual characteristics. Progesterone regulates changes in the endometrium during the menstrual cycle and pregnancy.

[0072] Steroid hormones are widely used for fertility control and in anti-inflammatory treatments for physical injuries and diseases such as arthritis, asthma, and autoimmune disorders. Progesterone, a naturally occurring progestin, is primarily used to treat amenorrhea and abnormal uterine bleeding, or as a contraceptive. Endogenous progesterone is responsible for inducing secretory activity in the endometrium of the estrogen-primed uterus in preparation for the implantation of a fertilized egg and for the maintenance of pregnancy. It is secreted from the corpus luteum in response to luteinizing hormone (LH). The primary contraceptive effect of exogenous progestins involves the suppression of the midcycle surge of LH. At the cellular level, progestins diffuse freely into target cells and bind to the progesterone receptor. Target cells include the female reproductive tract, the mammary gland, the hypothalamus, and the pituitary. Once bound to the receptor, progestins slow the frequency of release of gonadotropin releasing hormone from the hypothalamus and blunt the pre-ovulatory LH surge, thereby preventing follicular maturation and ovulation. Progesterone has minimal estrogenic and androgenic activity. Progesterone is metabolized hepatically to pregnanediol and conjugated with glucuronic acid.

[0073] Medroxyprogesterone (MAH), also known as 6α-methyl-17-hydroxyprogesterone, is a synthetic progestin with a pharmacological activity about 15 times greater than progesterone. MAH is used for the treatment of renal and endometrial carcinomas, amenorrhea, abnormal uterine bleeding, and endometriosis associated with hormonal imbalance. MAH has a stimulatory effect on respiratory centers and has been used in cases of low blood oxygenation caused by sleep apnea, chronic obstructive pulmonary disease, or hypercapnia.

[0074] Mifepristone, also known as RU-486, is an antiprogesterone drug that blocks receptors of progesterone. It counteracts the effects of progesterone, which is needed to sustain pregnancy. Mifepristone induces spontaneous abortion when administered in early pregnancy followed by treatment with the prostaglandin misoprostol. Further studies show that mifepristone at a substantially lower dose can be highly effective as a postcoital contraceptive when administered within five days after unprotected intercourse, thus providing women with a “morning-after pill” in case of contraceptive failure or sexual assault. Mifepristone also has potential uses in the treatment of breast and ovarian cancers in cases in which tumors are progesterone-dependent. It interferes with steroid-dependent growth of brain meningiomas, and may be useful in treatment of endometriosis where it blocks the estrogen-dependent growth of endometrial tissues. It may also be useful in treatment of uterine fibroid tumors and Cushing's Syndrome. Mifepristone binds to glucocorticoid receptors and interferes with cortisol binding. Mifepristone also may act as an anti-glucocorticoid and be effective for treating conditions where cortisol levels are elevated such as AIDS, anorexia nervosa, ulcers, diabetes, Parkinson's disease, multiple sclerosis, and Alzheimer's disease.

[0075] Danazol is a synthetic steroid derived from ethinyl testosterone. Danazol indirectly reduces estrogen production by lowering pituitary synthesis of follicle-stimulating hormone and LH. Danazol also binds to sex hormone receptors in target tissues, thereby exhibiting anabolic, antiestrogenic, and weakly androgenic activity. Danazol does not possess any progestogenic activity, and does not suppress normal pituitary release of corticotropin or release of cortisol by the adrenal glands. Danazol is used in the treatment of endometriosis to relieve pain and inhibit endometrial cell growth. It is also used to treat fibrocystic breast disease and hereditary angioedema.

[0076] Corticosteroids are used to relieve inflammation and to suppress the immune response. They inhibit eosinophil, basophil, and airway epithelial cell function by regulation of cytokines that mediate the inflammatory response. They inhibit leukocyte infiltration at the site of inflammation, interfere with the function of mediators of the inflammatory response, and suppress the humoral immune response. Corticosteroids are used to treat allergies, asthma, arthritis, and skin conditions. Beclomethasone is a synthetic glucocorticoid that is used to treat steroid-dependent asthma, to relieve symptoms associated with allergic or nonallergic (vasomotor) rhinitis, or to prevent recurrent nasal polyps following surgical removal. The anti-inflammatory and vasoconstrictive effects of intranasal beclomethasone are 5000 times greater than those produced by hydrocortisone. Budesonide is a corticosteroid used to control symptoms associated with allergic rhinitis or asthma. Budesonide has high topical anti-inflammatory activity but low systemic activity. Dexamethasone is a synthetic glucocorticoid used in anti-inflammatory or immunosuppressive compositions. It is also used in inhalants to prevent symptoms of asthma. Due to its greater ability to reach the central nervous system, dexamethasone is usually the treatment of choice to control cerebral edema. Dexamethasone is approximately 20-30 times more potent than hydrocortisone and 5-7 times more potent than prednisone. Prednisone is metabolized in the liver to its active form, prednisolone, a glucocorticoid with anti-inflammatory properties. Prednisone is approximately 4 times more potent than hydrocortisone and the duration of action of prednisone is intermediate between hydrocortisone and dexamethasone. Prednisone is used to treat allograft rejection, asthma, systemic lupus erythematosus, arthritis, ulcerative colitis, and other inflammatory conditions. Betamethasone is a synthetic glucocorticoid with anti-inflammatory and immunosuppressive activity and is used to treat psoriasis and fungal infections, such as athlete's foot and ringworm.

[0077] The anti-inflammatory actions of corticosteroids are thought to involve phospholipase A₂ inhibitory proteins, collectively called lipocortins. Lipocortins, in turn, control the biosynthesis of potent mediators of inflammation such as prostaglandins and leukotrienes by inhibiting the release of the precursor molecule arachidonic acid. Proposed mechanisms of action include decreased IgE synthesis, increased number of β-adrenergic receptors on leukocytes, and decreased arachidonic acid metabolism. During an immediate allergic reaction, such as in chronic bronchial asthma, allergens bridge the IgE antibodies on the surface of mast cells, which triggers these cells to release chemotactic substances. Mast cell influx and activation, therefore, is partially responsible for the inflammation and hyperirritability of the oral mucosa in asthmatic patients. This inflammation can be retarded by administration of corticosteroids.

[0078] The effects of a drug upon liver metabolism and hormone clearance mechanisms are important for understanding its pharmacodynamics. For example, the human C3A cell line is a clonal derivative of HepG2/C3 (hepatoma cell line, isolated from a 15-year-old male with liver tumor), which was selected for strong contact inhibition of growth. The use of a clonal population enhances the reproducibility of the cells. C3A cells have many characteristics of primary human hepatocytes in culture: i) expression of insulin receptor and insulin-like growth factor II receptor; ii) secretion of a high ratio of serum albumin compared with α-fetoprotein; iii) conversion of ammonia to urea and glutamine; iv) metabolism of aromatic amino acids; and v) proliferation in glucose-free and insulin-free medium. The C3A cell line is now well established as an in vitro model of the mature human liver (Mickelson, J. K. et al. (1995) Hepatology 22:866-875; Nagendra, A. R. et al. (1997) Am. J. Physiol. 272:G408-G416).

[0079] Alzheimer's disease is associated with defects in hippocampus signal transduction cascades (Dineley, K. T. et al. (2001) J. Neurosci. 21:4125-33). Alzheimer's disease is a progressive neurodegenerative disorder that is characterized by the formation of senile plaques and neurofibrillary tangles containing amyloid beta peptide. These plaques are found in limbic and association cortices of the brain, including hippocampus, temporal cortices, cingulate cortex, amygdala, nucleus basalis and locus caeruleus. Early in Alzheimer's pathology, physiological changes are visible in the cingulate cortex (Minoshima, S. et al. (1997) Annals of Neurology 42:85-94). In subjects with advanced Alzheimer's disease, accumulating plaques damage the neuronal architecture in limbic areas and eventually cripple the memory process.

[0080] There is a need in the art for new compositions, including nucleic acids and proteins, for the diagnosis, prevention, and treatment of cell proliferative, autoimmune/inflammatory, neurological, gastrointestinal, reproductive, cardiovascular, developmental, and vesicle trafficking disorders.

SUMMARY OF THE INVENTION

[0081] Various embodiments of the invention provide purified polypeptides, intracellular signaling molecules, referred to collectively as “INTSIG” and individually as “INTSIG-1,” “INTSIG-2,” “INTSIG-3,” “INTSIG-4,” “INTSIG-5,” “INTSIG-6,” “INTSIG-7,” “INTSIG-8,” “INTSIG-9,” “INTSIG-10,” “INTSIG-11,” “INTSIG-12,” “INTSIG-13,” “INTSIG-14,” “INTSIG-15,” “INTSIG-16,” “INTSIG-17,” “INTSIG-18,” “INTSIG-19,” “INTSIG-20,” “INTSIG-21,” “INTSIG-22,” and “INTSIG-23,” and methods for using these proteins and their encoding polynucleotides for the detection, diagnosis, and treatment of diseases and medical conditions. Embodiments also provide methods for utilizing the purified intracellular signaling molecules and/or their encoding polynucleotides for facilitating the drug discovery process, including determination of efficacy, dosage, toxicity, and pharmacology. Related embodiments provide methods for utilizing the purified intracellular signaling molecules and/or their encoding polynucleotides for investigating the pathogenesis of diseases and medical conditions.

[0082] An embodiment provides an isolated polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. Another embodiment provides an isolated polypeptide comprising an amino acid sequence of SEQ ID NO:1-23.

[0083] Still another embodiment provides an isolated polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. In another embodiment, the polynucleotide encodes a polypeptide selected from the group consisting of SEQ ID NO:1-23. In an alternative embodiment, the polynucleotide is selected from the group consisting of SEQ ID NO:24-46.

[0084] Still another embodiment provides a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. Another embodiment provides a cell transformed with the recombinant polynucleotide. Yet another embodiment provides a transgenic organism comprising the recombinant polynucleotide.

[0085] Another embodiment provides a method for producing a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. The method comprises a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide encoding the polypeptide, and b) recovering the polypeptide so expressed.

[0086] Yet another embodiment provides an isolated antibody which specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23.

[0087] Still yet another embodiment provides an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). In other embodiments, the polynucleotide can comprise at least about 20, 30, 40, 60, 80, or 100 contiguous nucleotides.

[0088] Yet another embodiment provides a method for detecting a target polynucleotide in a sample, said target polynucleotide being selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex. In a related embodiment, the method can include detecting the amount of the hybridization complex. In still other embodiments, the probe can comprise at least about 20, 30, 40, 60, 80, or 100 contiguous nucleotides.
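
By way of illustration only, the probe requirement recited above (at least 20 contiguous nucleotides complementary to the target) can be sketched in a few lines of Python. The function names and the example sequence are hypothetical and are not taken from the Sequence Listing. The sketch assumes exact Watson-Crick complementarity, whereas actual hybridization depends on stringency conditions and may tolerate mismatches.

```python
# Illustrative sketch only: checks that a DNA probe is long enough and is
# exactly complementary to a stretch of the target. Names are hypothetical.
_PAIRS = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(_PAIRS)[::-1]


def probe_matches_target(probe: str, target: str, min_len: int = 20) -> bool:
    """True if the probe has at least min_len nucleotides and its reverse
    complement occurs as a contiguous stretch of the target (assumption:
    perfect base-pairing, no mismatches)."""
    if len(probe) < min_len:
        return False
    return reverse_complement(probe) in target.upper()


# Made-up target sequence, for demonstration only.
example_target = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
example_probe = reverse_complement(example_target[5:30])  # 25 nucleotides
print(probe_matches_target(example_probe, example_target))
```

The default of 20 mirrors the minimum probe length recited above; the other recited lengths (30, 40, 60, 80, or 100) can be passed as `min_len`.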

[0089] Still yet another embodiment provides a method for detecting a target polynucleotide in a sample, said target polynucleotide being selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof. In a related embodiment, the method can include detecting the amount of the amplified target polynucleotide or fragment thereof.

[0090] Another embodiment provides a composition comprising an effective amount of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and a pharmaceutically acceptable excipient. In one embodiment, the composition can comprise an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. Other embodiments provide a method of treating a disease or condition associated with decreased or abnormal expression of functional INTSIG, comprising administering to a patient in need of such treatment the composition.

[0091] Yet another embodiment provides a method for screening a compound for effectiveness as an agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting agonist activity in the sample. Another embodiment provides a composition comprising an agonist compound identified by the method and a pharmaceutically acceptable excipient. Yet another embodiment provides a method of treating a disease or condition associated with decreased expression of functional INTSIG, comprising administering to a patient in need of such treatment the composition.

[0092] Still yet another embodiment provides a method for screening a compound for effectiveness as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in the sample. Another embodiment provides a composition comprising an antagonist compound identified by the method and a pharmaceutically acceptable excipient. Yet another embodiment provides a method of treating a disease or condition associated with overexpression of functional INTSIG, comprising administering to a patient in need of such treatment the composition.

[0093] Another embodiment provides a method of screening for a compound that specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. The method comprises a) combining the polypeptide with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide to the test compound, thereby identifying a compound that specifically binds to the polypeptide.

[0094] Yet another embodiment provides a method of screening for a compound that modulates the activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23. The method comprises a) combining the polypeptide with at least one test compound under conditions permissive for the activity of the polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c) comparing the activity of the polypeptide in the presence of the test compound with the activity of the polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide.
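
As a non-limiting illustration, the comparison in step c) above can be expressed as a ratio of the activity measured with the test compound to the activity measured without it. The Python sketch below is an assumption made for illustration: the function names, the 1.5-fold cutoff, and the activity values are hypothetical and are not specified by the method, which requires only that a change in activity be observed.

```python
# Illustrative sketch only: flags a test compound as a modulator when the
# activity ratio departs from 1 by more than a hypothetical fold cutoff.

def activity_ratio(with_compound: float, without_compound: float) -> float:
    """Activity in the presence of the compound relative to its absence."""
    if without_compound <= 0:
        raise ValueError("baseline activity must be positive")
    return with_compound / without_compound


def is_modulator(with_compound: float, without_compound: float,
                 fold_cutoff: float = 1.5) -> bool:
    """True if activity rises or falls by at least fold_cutoff
    (the cutoff is an assumed value, not taken from the text)."""
    ratio = activity_ratio(with_compound, without_compound)
    return ratio >= fold_cutoff or ratio <= 1.0 / fold_cutoff


# Made-up activity readings in arbitrary units.
print(is_modulator(30.0, 10.0))   # activity tripled
print(is_modulator(10.5, 10.0))   # essentially unchanged
print(is_modulator(4.0, 10.0))    # activity reduced
```

In practice the cutoff would be chosen from the variability of replicate measurements rather than fixed in advance.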

[0095] Still yet another embodiment provides a method for screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, the method comprising a) exposing a sample comprising the target polynucleotide to a compound, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.

[0096] Another embodiment provides a method for assessing toxicity of a test compound, said method comprising a) treating a biological sample containing nucleic acids with the test compound; b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, iii) a polynucleotide complementary to the polynucleotide of i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). 
Alternatively, the target polynucleotide can comprise a fragment of a polynucleotide selected from the group consisting of i)-v) above; c) quantifying the amount of hybridization complex; and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES

[0097] Table 1 summarizes the nomenclature for full length polynucleotide and polypeptide embodiments of the invention.

[0098] Table 2 shows the GenBank identification number and annotation of the nearest GenBank homolog for polypeptide embodiments of the invention. The probability scores for the matches between each polypeptide and its homolog(s) are also shown.

[0099] Table 3 shows structural features of polypeptide embodiments, including predicted motifs and domains, along with the methods, algorithms, and searchable databases used for analysis of the polypeptides.

[0100] Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble polynucleotide embodiments, along with selected fragments of the polynucleotides.

[0101] Table 5 shows representative cDNA libraries for polynucleotide embodiments.

[0102] Table 6 provides an appendix which describes the tissues and vectors used for construction of the cDNA libraries shown in Table 5.

[0103] Table 7 shows the tools, programs, and algorithms used to analyze polynucleotides and polypeptides, along with applicable descriptions, references, and threshold parameters.

DESCRIPTION OF THE INVENTION

[0104] Before the present proteins, nucleic acids, and methods are described, it is understood that embodiments of the invention are not limited to the particular machines, instruments, materials, and methods described, as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the invention.

[0105] As used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to “a host cell” includes a plurality of such host cells, and a reference to “an antibody” is a reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so forth.

[0106] Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any machines, materials, and methods similar or equivalent to those described herein can be used to practice or test the present invention, the preferred machines, materials and methods are now described. All publications mentioned herein are cited for the purpose of describing and disclosing the cell lines, protocols, reagents and vectors which are reported in the publications and which might be used in connection with various embodiments of the invention. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

Definitions

[0107] “INTSIG” refers to the amino acid sequences of substantially purified INTSIG obtained from any species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

[0108] The term “agonist” refers to a molecule which intensifies or mimics the biological activity of INTSIG. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other compound or composition which modulates the activity of INTSIG either by directly interacting with INTSIG or by acting on components of the biological pathway in which INTSIG participates.

[0109] An “allelic variant” is an alternative form of the gene encoding INTSIG. Allelic variants may result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in polypeptides whose structure or function may or may not be altered. A gene may have none, one, or many allelic variants of its naturally occurring form. Common mutational changes which give rise to allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. Each of these types of changes may occur alone, or in combination with the others, one or more times in a given sequence.

[0110] “Altered” nucleic acid sequences encoding INTSIG include those sequences with deletions, insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as INTSIG or a polypeptide with at least one functional characteristic of INTSIG. Included within this definition are polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of the polynucleotide encoding INTSIG, and improper or unexpected hybridization to allelic variants, with a locus other than the normal chromosomal locus for the polynucleotide encoding INTSIG. The encoded protein may also be “altered,” and may contain deletions, insertions, or substitutions of amino acid residues which produce a silent change and result in a functionally equivalent INTSIG. Deliberate amino acid substitutions may be made on the basis of one or more similarities in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological or immunological activity of INTSIG is retained. For example, negatively charged amino acids may include aspartic acid and glutamic acid, and positively charged amino acids may include lysine and arginine. Amino acids with uncharged polar side chains having similar hydrophilicity values may include: asparagine and glutamine; and serine and threonine. Amino acids with uncharged side chains having similar hydrophilicity values may include: leucine, isoleucine, and valine; glycine and alanine; and phenylalanine and tyrosine.

[0111] The terms “amino acid” and “amino acid sequence” can refer to an oligopeptide, a peptide, a polypeptide, or a protein sequence, or a fragment of any of these, and to naturally occurring or synthetic molecules. Where “amino acid sequence” is recited to refer to a sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms are not meant to limit the amino acid sequence to the complete native amino acid sequence associated with the recited protein molecule.

[0112] “Amplification” relates to the production of additional copies of a nucleic acid. Amplification may be carried out using polymerase chain reaction (PCR) technologies or other nucleic acid amplification technologies well known in the art.

[0113] The term “antagonist” refers to a molecule which inhibits or attenuates the biological activity of INTSIG. Antagonists may include proteins such as antibodies, anticalins, nucleic acids, carbohydrates, small molecules, or any other compound or composition which modulates the activity of INTSIG either by directly interacting with INTSIG or by acting on components of the biological pathway in which INTSIG participates.

[0114] The term “antibody” refers to intact immunoglobulin molecules as well as to fragments thereof, such as Fab, F(ab′)₂, and Fv fragments, which are capable of binding an epitopic determinant. Antibodies that bind INTSIG polypeptides can be prepared using intact polypeptides or using fragments containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

[0115] The term “antigenic determinant” refers to that region of a molecule (i.e., an epitope) that makes contact with a particular antibody. When a protein or a fragment of a protein is used to immunize a host animal, numerous regions of the protein may induce the production of antibodies which bind specifically to antigenic determinants (particular regions or three-dimensional structures on the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen used to elicit the immune response) for binding to an antibody.

[0116] The term “aptamer” refers to a nucleic acid or oligonucleotide molecule that binds to a specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX (Systematic Evolution of Ligands by EXponential Enrichment), described in U.S. Pat. No. 5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries. Aptamer compositions may be double-stranded or single-stranded, and may include deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules. The nucleotide components of an aptamer may have modified sugar groups (e.g., the 2′-OH group of a ribonucleotide may be replaced by 2′-F or 2′-NH₂), which may improve a desired property, e.g., resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules, e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system. Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a cross-linker (Brody, E. N. and L. Gold (2000) J. Biotechnol. 74:5-13).

[0117] The term “intramer” refers to an aptamer which is expressed in vivo. For example, a vaccinia virus-based RNA expression system has been used to express specific RNA aptamers at high levels in the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA 96:3606-3610).

[0118] The term “spiegelmer” refers to an aptamer which includes L-DNA, L-RNA, or other left-handed nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on substrates containing right-handed nucleotides.

[0119] The term “antisense” refers to any composition capable of base-pairing with the “sense” (coding) strand of a polynucleotide having a specific nucleic acid sequence. Antisense compositions may include DNA; RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified sugar groups such as 2′-methoxyethyl sugars or 2′-methoxyethoxy sugars; or oligonucleotides having modified bases such as 5-methyl cytosine, 2′-deoxyuracil, or 7-deaza-2′-deoxyguanosine. Antisense molecules may be produced by any method including chemical synthesis or transcription. Once introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring nucleic acid sequence produced by the cell to form duplexes which block either transcription or translation. The designation “negative” or “minus” can refer to the antisense strand, and the designation “positive” or “plus” can refer to the sense strand of a reference DNA molecule.

[0120] The term “biologically active” refers to a protein having structural, regulatory, or biochemical functions of a naturally occurring molecule. Likewise, “immunologically active” or “immunogenic” refers to the capability of the natural, recombinant, or synthetic INTSIG, or of any oligopeptide thereof, to induce a specific immune response in appropriate animals or cells and to bind with specific antibodies.

[0121] “Complementary” describes the relationship between two single-stranded nucleic acid sequences that anneal by base-pairing. For example, 5′-AGT-3′ pairs with its complement, 3′-TCA-5′.
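By way of illustration only, the pairing in the example above can be sketched in a few lines of Python. The helper names `complement_3to5` and `reverse_complement` are hypothetical and are not part of the specification; the sketch assumes unmodified DNA bases and standard Watson-Crick pairing.

```python
# Watson-Crick pairing for the four DNA bases (illustrative assumption:
# no modified or ambiguous bases are present).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_3to5(seq_5to3: str) -> str:
    """Return the complementary strand, written base-by-base in the 3'->5' direction."""
    return "".join(PAIRS[base] for base in seq_5to3.upper())


def reverse_complement(seq_5to3: str) -> str:
    """Return the same complementary strand rewritten in the usual 5'->3' direction."""
    return complement_3to5(seq_5to3)[::-1]


print(complement_3to5("AGT"))     # TCA, i.e. 3'-TCA-5'
print(reverse_complement("AGT"))  # ACT, the same strand read 5'->3'
```

The first function reproduces the notation used in the definition (3′-TCA-5′); the second gives the orientation in which sequences are normally listed.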

[0122] A “composition comprising a given polynucleotide” and a “composition comprising a given polypeptide” can refer to any composition containing the given polynucleotide or polypeptide. The composition may comprise a dry formulation or an aqueous solution. Compositions comprising polynucleotides encoding INTSIG or fragments of INTSIG may be employed as hybridization probes. The probes may be stored in freeze-dried form and may be associated with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be deployed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., sodium dodecyl sulfate; SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

[0123] “Consensus sequence” refers to a nucleic acid sequence which has been subjected to repeated DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied Biosystems, Foster City Calif.) in the 5′ and/or the 3′ direction, and resequenced, or which has been assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison Wis.) or Phrap (University of Washington, Seattle Wash.). Some sequences have been both extended and assembled to produce the consensus sequence.

[0124] “Conservative amino acid substitutions” are those substitutions that are predicted to least interfere with the properties of the original protein, i.e., the structure and especially the function of the protein is conserved and not significantly changed by such substitutions. The table below shows amino acids which may be substituted for an original amino acid in a protein and which are regarded as conservative amino acid substitutions.

Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr

[0125] Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of the side chain.
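As a non-limiting sketch, the substitution table above can be held as a lookup structure and queried programmatically. The dictionary name `CONSERVATIVE` and the function `is_conservative` are hypothetical names chosen for this example. The lookup is directional, as in the table: for instance, Ala is listed as a substitute for Cys, but Cys is not listed as a substitute for Ala.

```python
# Conservative substitutions transcribed from the table above
# (three-letter residue codes; residues absent from the table, such as Pro, have no entry).
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if `replacement` is listed in the table as a substitute for `original`."""
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Ile", "Val"))  # True
print(is_conservative("Ala", "Cys"))  # False: the table lists Ala for Cys, not Cys for Ala
```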

[0126] A “deletion” refers to a change in the amino acid or nucleotide sequence that results in the absence of one or more amino acid residues or nucleotides.

[0127] The term “derivative” refers to a chemically modified polynucleotide or polypeptide. Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains at least one biological or immunological function of the natural molecule. A derivative polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least one biological or immunological function of the polypeptide from which it was derived.

[0128] A “detectable label” refers to a reporter molecule or enzyme that is capable of generating a measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

[0129] “Differential expression” refers to increased or upregulated; or decreased, downregulated, or absent gene or protein expression, determined by comparing at least two different samples. Such comparisons may be carried out between, for example, a treated and an untreated sample, or a diseased and a normal sample.

[0130] “Exon shuffling” refers to the recombination of different coding regions (exons). Since an exon may represent a structural or functional domain of the encoded protein, new proteins may be assembled through the novel reassortment of stable substructures, thus allowing acceleration of the evolution of new protein functions.

[0131] A “fragment” is a unique portion of INTSIG or a polynucleotide encoding INTSIG which can be identical in sequence to, but shorter in length than, the parent sequence. A fragment may comprise up to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise from about 5 to about 1000 contiguous nucleotides or amino acid residues. A fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid residues in length. Fragments may be preferentially selected from certain regions of a molecule. For example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the specification, including the Sequence Listing, tables, and figures, may be encompassed by the present embodiments.

[0132] A fragment of SEQ ID NO:24-46 can comprise a region of unique polynucleotide sequence that specifically identifies SEQ ID NO:24-46, for example, as distinct from any other sequence in the genome from which the fragment was obtained. A fragment of SEQ ID NO:24-46 can be employed in one or more embodiments of methods of the invention, for example, in hybridization and amplification technologies and in analogous methods that distinguish SEQ ID NO:24-46 from related polynucleotides. The precise length of a fragment of SEQ ID NO:24-46 and the region of SEQ ID NO:24-46 to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

[0133] A fragment of SEQ ID NO:1-23 is encoded by a fragment of SEQ ID NO:24-46. A fragment of SEQ ID NO:1-23 can comprise a region of unique amino acid sequence that specifically identifies SEQ ID NO:1-23. For example, a fragment of SEQ ID NO:1-23 can be used as an immunogenic peptide for the development of antibodies that specifically recognize SEQ ID NO:1-23. The precise length of a fragment of SEQ ID NO:1-23 and the region of SEQ ID NO:1-23 to which the fragment corresponds can be determined based on the intended purpose for the fragment using one or more analytical methods described herein or otherwise known in the art.

[0134] A “full length” polynucleotide is one containing at least a translation initiation codon (e.g., methionine) followed by an open reading frame and a translation termination codon. A “full length” polynucleotide sequence encodes a “full length” polypeptide sequence.
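For illustration, the structural features recited above (an initiation codon, an open reading frame, and a termination codon) can be checked with a short scan. This is a minimal sketch under stated assumptions: the input is DNA in the sense orientation, ATG is taken as the initiation codon, and TAA, TAG, and TGA are taken as termination codons. The function name `first_orf` is hypothetical.

```python
# Termination codons of the standard genetic code, written as DNA (assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(seq: str):
    """Return the first ATG...stop reading frame found in `seq`, or None if there is none.

    Each ATG is tried in turn; the sequence is read codon by codon from that ATG
    until an in-frame termination codon is reached.
    """
    seq = seq.upper()
    start = seq.find("ATG")
    while start != -1:
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        start = seq.find("ATG", start + 1)
    return None


print(first_orf("GGATGAAATAGCC"))  # ATGAAATAG
print(first_orf("ATGAAA"))         # None: no in-frame termination codon
```

A sequence for which the scan returns `None` lacks at least one of the recited features and would not be regarded as full length under this simplified test.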

[0135] “Homology” refers to sequence similarity or, interchangeably, sequence identity, between two or more polynucleotide sequences or two or more polypeptide sequences.

[0136] The terms “percent identity” and “% identity,” as applied to polynucleotide sequences, refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences.

[0137] Percent identity between polynucleotide sequences may be determined using one or more computer algorithms or programs known in the art or described herein. For example, percent identity can be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program. This program is part of the LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR, Madison Wis.). CLUSTAL V is described in Higgins, D. G. and P. M. Sharp (1989; CABIOS 5:151-153) and in Higgins, D. G. et al. (1992; CABIOS 8:189-191). For pairwise alignments of polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5, window=4, and “diagonals saved”=4. The “weighted” residue weight table is selected as the default. Percent identity is reported by CLUSTAL V as the “percent similarity” between aligned polynucleotide sequences.

[0138] Alternatively, a suite of commonly used and freely available sequence comparison algorithms which can be used is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) (Altschul, S. F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from several sources, including the NCBI, Bethesda, Md., and on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis programs including “blastn,” which is used to align a known polynucleotide sequence with other polynucleotide sequences from a variety of databases. Also available is a tool called “BLAST 2 Sequences” that is used for direct pairwise comparison of two nucleotide sequences. “BLAST 2 Sequences” can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. The “BLAST 2 Sequences” tool can be used for both blastn and blastp (discussed below). BLAST programs are commonly used with gap and other parameters set to default settings. For example, to compare two nucleotide sequences, one may use blastn with the “BLAST 2 Sequences” tool Version 2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example:

[0139] Matrix: BLOSUM62

[0140] Reward for match: 1

[0141] Penalty for mismatch: −2

[0142] Open Gap: 5 and Extension Gap: 2 penalties

[0143] Gap x drop-off: 50

[0144] Expect: 10

[0145] Word Size: 11

[0146] Filter: on
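To indicate how the match, mismatch, and gap values listed above contribute to a score, the following sketch computes a raw score for a pair of sequences that have already been aligned. It is illustrative only and is not the BLAST implementation. The function name `raw_score` is hypothetical, and the sketch assumes an affine gap convention in which a gap of length k costs the open penalty plus k times the extension penalty.

```python
def raw_score(aligned_a: str, aligned_b: str,
              reward: int = 1, penalty: int = -2,
              gap_open: int = 5, gap_extend: int = 2) -> int:
    """Score two pre-aligned nucleotide strings in which '-' marks a gap position.

    Assumed convention (an assumption of this sketch): a gap of length k
    costs gap_open + k * gap_extend.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    gap_side = None  # which sequence currently carries an open gap: "a", "b", or None
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            if side != gap_side:
                score -= gap_open      # a new gap is opened
                gap_side = side
            score -= gap_extend        # every gap position pays the extension cost
        else:
            gap_side = None
            score += reward if x == y else penalty
    return score


print(raw_score("ACGT", "ACGT"))          # 4: four matches
print(raw_score("ACGT", "AGGT"))          # 1: three matches, one mismatch
print(raw_score("ACGTACGT", "ACG--CGT"))  # -3: six matches, one gap of length two
```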

[0147] Percent identity may be measured over the length of an entire defined sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to describe a length over which percentage identity may be measured.
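As a minimal sketch of the measurement described above, percent identity can be computed from two sequences that have already been aligned, either over the whole alignment or over a shorter window. The function name `percent_identity` is hypothetical. The sketch assumes that the denominator is the number of alignment columns in the chosen window, gaps included; alignment programs may use other denominators, so the values need not match those reported by CLUSTAL V or BLAST.

```python
def percent_identity(aligned_a: str, aligned_b: str, start: int = 0, end=None) -> float:
    """Percentage of identical residues between two pre-aligned sequences.

    '-' marks a gap. `start` and `end` select a window of alignment columns;
    by default the entire alignment is used.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = list(zip(aligned_a[start:end], aligned_b[start:end]))
    if not columns:
        raise ValueError("the selected window is empty")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


full = percent_identity("ACGTACGTAC", "ACGTTCGTAC")         # whole alignment
window = percent_identity("ACGTACGTAC", "ACGTTCGTAC", 0, 4)  # first four columns only
print(full, window)  # 90.0 100.0
```

The example shows why the measured length is stated alongside the percentage: the same pair of sequences is 90% identical overall but 100% identical over the selected fragment.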

[0148] Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
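The degeneracy described above can be illustrated with a short translation routine. The sketch below uses the standard genetic code, laid out in the conventional T, C, A, G codon order; the names `CODON_TABLE` and `translate` are hypothetical, and the two example sequences are invented for illustration and do not correspond to any sequence of the Sequence Listing.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence codon by codon ('*' marks a termination codon)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


seq_1 = "ATGGCTAAA"  # Met-Ala-Lys
seq_2 = "ATGGCGAAG"  # differs from seq_1 at two third-codon positions

print(translate(seq_1), translate(seq_2))  # MAK MAK
```

The two polynucleotides differ at two of nine positions yet encode the same peptide.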

[0149] The phrases “percent identity” and “% identity,” as applied to polypeptide sequences, refer to the percentage of residue matches between at least two polypeptide sequences aligned using a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment methods take into account conservative amino acid substitutions. Such conservative substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at the site of substitution, thus preserving the structure (and therefore function) of the polypeptide.

[0150] Percent identity between polypeptide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program (described and referenced above). For pairwise alignments of polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap penalty=3, window=5, and “diagonals saved”=5. The PAM250 matrix is selected as the default residue weight table. As with polynucleotide alignments, the percent identity is reported by CLUSTAL V as the “percent similarity” between aligned polypeptide sequence pairs.

[0151] Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise comparison of two polypeptide sequences, one may use the “BLAST 2 Sequences” tool Version 2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for example:

[0152] Matrix: BLOSUM62

[0153] Open Gap: 11 and Extension Gap: 1 penalties

[0154] Gap x drop-off: 50

[0155] Expect: 10

[0156] Word Size: 3

[0157] Filter: on.

[0158] Percent identity may be measured over the length of an entire defined polypeptide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be used to describe a length over which percentage identity may be measured.

“Human artificial chromosomes” (HACs) are linear microchromosomes which may contain DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for chromosome replication, segregation and maintenance.

[0159] The term “humanized antibody” refers to an antibody molecule in which the amino acid sequence in the non-antigen binding regions has been altered so that the antibody more closely resembles a human antibody, and still retains its original binding ability.

[0160] “Hybridization” refers to the process by which a polynucleotide strand anneals with a complementary strand through base pairing under defined hybridization conditions. Specific hybridization is an indication that two nucleic acid sequences share a high degree of complementarity. Specific hybridization complexes form under permissive annealing conditions and remain hybridized after the “washing” step(s). The washing step(s) is particularly important in determining the stringency of the hybridization process, with more stringent conditions allowing less non-specific binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill in the art and may be consistent among hybridization experiments, whereas wash conditions may be varied among experiments to achieve the desired stringency, and therefore hybridization specificity. Permissive annealing conditions occur, for example, at 68° C. in the presence of about 6×SSC, about 1% (w/v) SDS, and about 100 μg/ml sheared, denatured salmon sperm DNA.

[0161] Generally, stringency of hybridization is expressed, in part, with reference to the temperature under which the wash step is carried out. Such wash temperatures are typically selected to be about 5° C. to 20° C. lower than the thermal melting point (T_m) for the specific sequence at a defined ionic strength and pH. The T_m is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. An equation for calculating T_m and conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview N.Y.; specifically see volume 2, chapter 9.
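For illustration only, the relationship between melting temperature and wash temperature can be sketched as follows. The sketch substitutes the simple Wallace rule (2° C. per A or T and 4° C. per G or C), a rough estimate for short oligonucleotides, for the equation referenced above; it is not asserted to be the equation given in Sambrook et al., and it takes no account of ionic strength, pH, or formamide. The function names `wallace_tm` and `wash_window` are hypothetical.

```python
def wallace_tm(oligo: str) -> float:
    """Rough melting temperature in degrees C for a short oligonucleotide (Wallace rule)."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2.0 * at + 4.0 * gc


def wash_window(tm: float) -> tuple:
    """Wash temperatures about 5 to 20 degrees C below the melting temperature."""
    return (tm - 20.0, tm - 5.0)


probe = "AGTCAGTCAGTCAGTCAGTC"  # invented 20-mer with 50% G+C content
tm = wallace_tm(probe)
print(tm, wash_window(tm))  # 60.0 (40.0, 55.0)
```

Washing nearer the upper end of the window corresponds to higher stringency; washing nearer the lower end tolerates more mismatched duplexes.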

[0162] High stringency conditions for hybridization between polynucleotides of the present invention include wash conditions of 68° C. in the presence of about 0.2×SSC and about 0.1% SDS, for 1 hour. Alternatively, temperatures of about 65° C., 60° C., 55° C., or 42° C. may be used. SSC concentration may be varied from about 0.1 to 2×SSC, with SDS being present at about 0.1%. Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents include, for instance, sheared and denatured salmon sperm DNA at about 100-200 μg/ml. Organic solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily apparent to those of ordinary skill in the art. Hybridization, particularly under high stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

[0163] The term “hybridization complex” refers to a complex formed between two nucleic acids by virtue of the formation of hydrogen bonds between complementary bases. A hybridization complex may be formed in solution (e.g., C₀t or R₀t analysis) or formed between one nucleic acid present in solution and another nucleic acid immobilized on a solid support (e.g., paper, membranes, filters, chips, pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids have been fixed).

[0164] The words “insertion” and “addition” refer to changes in an amino acid or polynucleotide sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.

[0165] “Immune response” can refer to conditions associated with inflammation, trauma, immune disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect cellular and systemic defense systems.

[0166] An “immunogenic fragment” is a polypeptide or oligopeptide fragment of INTSIG which is capable of eliciting an immune response when introduced into a living organism, for example, a mammal. The term “immunogenic fragment” also includes any polypeptide or oligopeptide fragment of INTSIG which is useful in any of the antibody production methods disclosed herein or known in the art.

[0167] The term “microarray” refers to an arrangement of a plurality of polynucleotides, polypeptides, antibodies, or other chemical compounds on a substrate.

[0168] The terms “element” and “array element” refer to a polynucleotide, polypeptide, antibody, or other chemical compound having a unique and defined position on a microarray.

[0169] The term “modulate” refers to a change in the activity of INTSIG. For example, modulation may cause an increase or a decrease in protein activity, binding characteristics, or any other biological, functional or immunological properties of INTSIG.

[0170] The phrases “nucleic acid” and “nucleic acid sequence” refer to a nucleotide, oligonucleotide, polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or synthetic origin which may be single-stranded or double-stranded and may represent the sense or the antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material.

“Operably linked” refers to the situation in which a first nucleic acid sequence is placed in a functional relationship with a second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where necessary to join two protein coding regions, in the same reading frame.

[0171] “Peptide nucleic acid” (PNA) refers to an antisense molecule or anti-gene agent which comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. PNAs preferentially bind complementary single stranded DNA or RNA and stop transcript elongation, and may be pegylated to extend their lifespan in the cell.

[0172] “Post-translational modification” of an INTSIG may involve lipidation, glycosylation, phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by cell type depending on the enzymatic milieu of INTSIG.

[0173] “Probe” refers to nucleic acids encoding INTSIG, their complements, or fragments thereof, which are used to detect identical, allelic or related nucleic acids. Probes are isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. “Primers” are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic acid, e.g., by the polymerase chain reaction (PCR).

[0174] Probes and primers as used in the present invention typically comprise at least 15 contiguous nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may be considerably longer than these examples, and it is understood that any length supported by the specification, including the tables, figures, and Sequence Listing, may be used.

[0175] Methods for preparing and using probes and primers are described in the references, for example Sambrook, J. et al. (1989; Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press, Plainview N.Y.), Ausubel, F. M. et al. (1999; Short Protocols in Molecular Biology, 4th ed., John Wiley & Sons, New York N.Y.), and Innis, M. et al. (1990; PCR Protocols, A Guide to Methods and Applications, Academic Press, San Diego Calif.). PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge Mass.).

[0176] Oligonucleotides for use as primers are selected using software known in the art for such purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection programs have incorporated additional features for expanded capabilities. For example, the PrimOU primer selection program (available to the public from the Genome Center at University of Texas South West Medical Center, Dallas Tex.) is capable of choosing specific primers from megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer selection program (available to the public from the Whitehead Institute/MIT Center for Genome Research, Cambridge Mass.) allows the user to input a “mispriming library,” in which sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for microarrays. (The source code for the latter two primer selection programs may also be obtained from their respective sources and modified to meet the user's specific needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments identified by any of the above selection methods are useful in hybridization technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to those described above.

[0177] A “recombinant nucleic acid” is a nucleic acid that is not naturally occurring or has a sequence that is made by an artificial combination of two or more otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques such as those described in Sambrook et al., supra. The term recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.

[0178] Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is expressed, inducing a protective immunological response in the mammal.

[0179] A “regulatory element” refers to a nucleic acid sequence usually derived from untranslated regions of a gene and includes enhancers, promoters, introns, and 5′ and 3′ untranslated regions (UTRs). Regulatory elements interact with host or viral proteins which control transcription, translation, or RNA stability.

[0180] “Reporter molecules” are chemical or biochemical moieties used for labeling a nucleic acid, amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent, chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and other moieties known in the art.

[0181] An “RNA equivalent,” in reference to a DNA molecule, is composed of the same linear sequence of nucleotides as the reference DNA molecule with the exception that all occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose.
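At the level of the written sequence, the definition above amounts to replacing each thymine with uracil, as the following sketch shows. The function name `rna_equivalent` is hypothetical; the change of sugar backbone from deoxyribose to ribose is a chemical property that a character string cannot represent.

```python
def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence: same linear order, with T replaced by U."""
    return dna.upper().replace("T", "U")


print(rna_equivalent("ATGGCTTAA"))  # AUGGCUUAA
```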

[0182] The term “sample” is used in its broadest sense. A sample suspected of containing INTSIG, nucleic acids encoding INTSIG, or fragments thereof may comprise a bodily fluid; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA, in solution or bound to a substrate; a tissue; a tissue print; etc.

[0183] The terms “specific binding” and “specifically binding” refer to that interaction between a protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or synthetic binding composition. The interaction is dependent upon the presence of a particular structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding molecule. For example, if an antibody is specific for epitope “A,” the presence of a polypeptide comprising the epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A and the antibody will reduce the amount of labeled A that binds to the antibody.

[0184] The term “substantially purified” refers to nucleic acid or amino acid sequences that are removed from their natural environment and are isolated or separated, and are at least about 60% free, preferably at least about 75% free, and most preferably at least about 90% free from other components with which they are naturally associated.

[0185] A “substitution” refers to the replacement of one or more amino acid residues or nucleotides by different amino acid residues or nucleotides, respectively.

[0186] “Substrate” refers to any suitable rigid or semi-rigid support including membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles and capillaries. The substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

[0187] A “transcript image” or “expression profile” refers to the collective pattern of gene expression by a particular cell type or tissue under given conditions at a given time.

[0188] “Transformation” describes a process by which exogenous DNA is introduced into a recipient cell. Transformation may occur under natural or artificial conditions according to various methods well known in the art, and may rely on any known method for the insertion of foreign nucleic acid sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based on the type of host cell being transformed and may include, but is not limited to, bacteriophage or viral infection, electroporation, heat shock, lipofection, and particle bombardment. The term “transformed cells” includes stably transformed cells in which the inserted DNA is capable of replication either as an autonomously replicating plasmid or as part of the host chromosome, as well as transiently transformed cells which express the inserted DNA or RNA for limited periods of time.

[0189] A “transgenic organism,” as used herein, is any organism, including but not limited to animals and plants, in which one or more of the cells of the organism contains heterologous nucleic acid introduced by way of human intervention, such as by transgenic techniques well known in the art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with a recombinant virus. In another embodiment, the nucleic acid can be introduced by infection with a recombinant viral vector, such as a lentiviral vector (Lois, C. et al. (2002) Science 295:868-872). The term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather is directed to the introduction of a recombinant DNA molecule. The transgenic organisms contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, plants and animals. The isolated DNA of the present invention can be introduced into the host by methods known in the art, for example infection, transfection, transformation or transconjugation. Techniques for transferring the DNA of the present invention into such organisms are widely known and provided in references such as Sambrook et al., supra.

[0190] A “variant” of a particular nucleic acid sequence is defined as a nucleic acid sequence having at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of the nucleic acid sequences using blastn with the “BLAST 2 Sequences” tool Version 2.0.9 (May-07-1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length. A variant may be described as, for example, an “allelic” (as defined above), “splice,” “species,” or “polymorphic” variant. A splice variant may have significant identity to a reference molecule, but will generally have a greater or lesser number of polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding polypeptide may possess additional functional domains or lack domains that are present in the reference molecule. Species variants are polynucleotides that vary from one species to another. The resulting polypeptides will generally have significant amino acid identity relative to each other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants also may encompass “single nucleotide polymorphisms” (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The presence of SNPs may be indicative of, for example, a certain population, a disease state, or a propensity for a disease state.

[0191] A “variant” of a particular polypeptide sequence is defined as a polypeptide sequence having at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the polypeptide sequences using blastp with the “BLAST 2 Sequences” tool Version 2.0.9 (May-07-1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length of one of the polypeptides.
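
The percent identity thresholds in the two “variant” definitions above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (for example by blastn or blastp) and are supplied as equal-length strings with “-” marking gaps; the function names and the treatment of gap columns as mismatches are illustrative assumptions, not the BLAST 2 Sequences implementation.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over all aligned columns.

    Assumption: gap columns ('-') count as mismatches, so the
    denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def is_variant(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True if the pair meets the identity threshold (default 40%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Raising `threshold` to 50.0, 60.0, and so on corresponds to the narrower identity ranges recited in the definitions.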

[0192] The Invention

[0193] Various embodiments of the invention include new human intracellular signaling molecules (INTSIG), the polynucleotides encoding INTSIG, and the use of these compositions for the diagnosis, treatment, or prevention of cell proliferative, autoimmune/inflammatory, neurological, gastrointestinal, reproductive, cardiovascular, developmental, and vesicle trafficking disorders.

[0194] Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide embodiments of the invention. Each polynucleotide and its corresponding polypeptide are correlated to a single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is denoted by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an Incyte polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide sequence is denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ ID NO:) and an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as shown.

[0195] Table 2 shows sequences with homology to the polypeptides of the invention as identified by BLAST analysis against the GenBank protein (genpept) database. Columns 1 and 2 show the polypeptide sequence identification number (Polypeptide SEQ ID NO:) and the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of the invention. Column 3 shows the GenBank identification number (GenBank ID NO:) of the nearest GenBank homolog. Column 4 shows the probability scores for the matches between each polypeptide and its homolog(s). Column 5 shows the annotation of the GenBank homolog(s) along with relevant citations where applicable, all of which are expressly incorporated by reference herein.

[0196] Table 3 shows various structural features of the polypeptides of the invention. Columns 1 and 2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding Incyte polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention. Column 3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the MOTIFS program of the GCG sequence analysis software package (Genetics Computer Group, Madison Wis.). Column 6 shows amino acid residues comprising signature sequences, domains, and motifs. Column 7 shows analytical methods for protein structure/function analysis and in some cases, searchable databases to which the analytical methods were applied.

[0197] Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these properties establish that the claimed polypeptides are intracellular signaling molecules. For example, SEQ ID NO:1 is 31% identical, from residue E807 to residue T1198, to a human guanine nucleotide regulatory protein (GenBank ID g393095) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 1.4e-28, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:1 also contains a RhoGAP domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS and additional BLAST analyses provide further corroborative evidence that SEQ ID NO:1 is a GTPase regulatory protein.

[0198] In another example, SEQ ID NO:5 is 95% identical, from residue M1 to residue I346, to human reticulocalbin (GenBank ID g1262329) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 2.0e-177, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:5 also contains EF hand domains as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS and MOTIFS analysis provide further corroborative evidence that SEQ ID NO:5 is a calcium binding protein.

[0199] In yet another example, SEQ ID NO:8 is 62% identical, from residue W97 to residue Y569, to a human ras GTPase-activating-like protein domain (GenBank ID g536844) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 1.5e-172, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:8 also contains an IQ calmodulin-binding motif and a GTPase-activator protein for Ras-like GTPase domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLAST-PRODOM and -DOMO analyses provide further corroborative evidence that SEQ ID NO:8 is a GTPase-activating protein which mediates signal transduction between a particular class of extracellular receptors, the G-protein coupled receptors (GPCRs), and intracellular second messengers such as cAMP and Ca²⁺.

[0200] In another example, SEQ ID NO:10 is 35% identical, from residue L5 to residue L195, to human tubby super-family protein (GenBank ID g9858154) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 3.7e-24, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:10 also contains WD domain G-beta repeats as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLAST analysis of the PRODOM database provide further corroborative evidence that SEQ ID NO:10 is a tubby super-family protein. In another example, SEQ ID NO:12 is 89% identical, from residue A82 to residue W1120, to mouse Dbs, a homolog of the guanine nucleotide exchange factor Dbl (GenBank ID g9755425) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 0.0, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:12 also contains a PH domain, a RhoGEF domain, and a spectrin repeat as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and further BLAST analyses provide further corroborative evidence that SEQ ID NO:12 is a guanine nucleotide exchange factor.

[0201] In another example, SEQ ID NO:18 is 66% identical, from residue P5 to residue K628, to guanylate binding protein (GenBank ID g193444) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 5.1e-220, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:18 also contains guanylate binding protein N- and C-terminal domains as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from MOTIFS and additional BLAST analyses provide further corroborative evidence that SEQ ID NO:18 is a guanylate binding protein.

[0202] In another example, SEQ ID NO:19 is 37% identical, from residue G1146 to residue S1527, to C. elegans GTPase-activating protein (GenBank ID g437181) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 9.2e-60, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:19 also contains PDZ, PH, and RhoGAP domains as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and additional BLAST analyses provide further corroborative evidence that SEQ ID NO:19 is a GTPase-activating protein.

[0203] In yet another example, SEQ ID NO:22 is 47% identical, from residue V45 to residue H211, to human soluble adenylyl cyclase (GenBank ID g7650188) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 1.1e-33, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:22 also contains an adenylate and guanylate cyclase active site domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from MOTIFS analysis provide further corroborative evidence that SEQ ID NO:22 is an adenylate cyclase. SEQ ID NO:2-4, SEQ ID NO:6-7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13-17, SEQ ID NO:20-21, and SEQ ID NO:23 were analyzed and annotated in a similar manner. The algorithms and parameters for the analysis of SEQ ID NO:1-23 are described in Table 7.

[0204] As shown in Table 4, the full length polynucleotide embodiments were assembled using cDNA sequences or coding (exon) sequences derived from genomic DNA, or any combination of these two types of sequences. Column 1 lists the polynucleotide sequence identification number (Polynucleotide SEQ ID NO:), the corresponding Incyte polynucleotide consensus sequence number (Incyte ID) for each polynucleotide of the invention, and the length of each polynucleotide sequence in basepairs. Column 2 shows the nucleotide start (5′) and stop (3′) positions of the cDNA and/or genomic sequences used to assemble the full length polynucleotide embodiments, and of fragments of the polynucleotides which are useful, for example, in hybridization or amplification technologies that identify SEQ ID NO:24-46 or that distinguish between SEQ ID NO:24-46 and related polynucleotides.

[0205] The polynucleotide fragments described in Column 2 of Table 4 may refer specifically, for example, to Incyte cDNAs derived from tissue-specific cDNA libraries or from pooled cDNA libraries. Alternatively, the polynucleotide fragments described in column 2 may refer to GenBank cDNAs or ESTs which contributed to the assembly of the full length polynucleotides. In addition, the polynucleotide fragments described in column 2 may identify sequences derived from the ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the designation “ENST”). Alternatively, the polynucleotide fragments described in column 2 may be derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences including the designation “NM” or “NT”) or the NCBI RefSeq Protein Sequence Records (i.e., those sequences including the designation “NP”). Alternatively, the polynucleotide fragments described in column 2 may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an “exon stitching” algorithm. For example, a polynucleotide sequence identified as FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a “stitched” sequence in which XXXXXX is the identification number of the cluster of sequences to which the algorithm was applied, and YYYYY is the number of the prediction generated by the algorithm, and N1,2,3 . . . , if present, represent specific exons that may have been manually edited during analysis (See Example V). Alternatively, the polynucleotide fragments in column 2 may refer to assemblages of exons brought together by an “exon-stretching” algorithm. For example, a polynucleotide sequence identified as FLXXXXXX_gAAAAA_gBBBBB_1_N is a “stretched” sequence, with XXXXXX being the Incyte project identification number, gAAAAA being the GenBank identification number of the human genomic sequence to which the “exon-stretching” algorithm was applied, gBBBBB being the GenBank identification number or NCBI RefSeq identification number of the nearest GenBank protein homolog, and N referring to specific exons (See Example V). In instances where a RefSeq sequence was used as a protein homolog for the “exon-stretching” algorithm, a RefSeq identifier (denoted by “NM,” “NP,” or “NT”) may be used in place of the GenBank identifier (i.e., gBBBBB).
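
As an illustration only, a “stretched” identifier of the kind described above could be split into its parts as sketched below. The sketch assumes the layout FLXXXXXX_gAAAAA_gBBBBB_1_N with purely numeric fields and a literal “_1_” before the exon field; the pattern, the `StretchedId` type, and the example identifier in the usage note are hypothetical. RefSeq identifiers used in place of gBBBBB are not handled.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: FL<project>_g<genomic GenBank ID>_g<homolog GenBank ID>_1_<exon>
_STRETCHED = re.compile(r"^FL(\d+)_g(\d+)_g(\d+)_1_(\w+)$")


class StretchedId(NamedTuple):
    project_id: str   # Incyte project identification number (XXXXXX)
    genomic_id: str   # GenBank ID of the human genomic sequence (AAAAA)
    homolog_id: str   # GenBank ID of the nearest protein homolog (BBBBB)
    exon: str         # exon designation (N)


def parse_stretched(identifier: str) -> Optional[StretchedId]:
    """Return the parsed fields, or None if the identifier does not fit the assumed layout."""
    match = _STRETCHED.match(identifier)
    return StretchedId(*match.groups()) if match else None
```

For a made-up identifier such as `FL123456_g11111_g22222_1_3`, `parse_stretched` yields project 123456, genomic sequence 11111, homolog 22222, and exon 3.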

[0206] Alternatively, a prefix identifies component sequences that were hand-edited, predicted from genomic DNA sequences, or derived from a combination of sequence analysis methods. The following Table lists examples of component sequence prefixes and corresponding sequence analysis methods associated with the prefixes (see Example IV and Example V).

Prefix            Type of analysis and/or examples of programs
GNN, GFG, ENST    Exon prediction from genomic sequences using, for example, GENSCAN (Stanford University, CA, USA) or FGENES (Computer Genomics Group, The Sanger Centre, Cambridge, UK).
GBI               Hand-edited analysis of genomic sequences.
FL                Stitched or stretched genomic sequences (see Example V).
INCY              Full length transcript and exon prediction from mapping of EST sequences to the genome. Genomic location and EST composition data are combined to predict the exons and resulting transcript.
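
The prefix-to-method correspondence listed above can be expressed as a simple lookup. The sketch below is illustrative: it assumes a component sequence identifier begins directly with its prefix, and the function name and example identifiers are hypothetical.

```python
from typing import Optional

# Prefix -> type of analysis, as listed in the Table of paragraph [0206].
PREFIX_METHODS = {
    "GNN": "exon prediction from genomic sequences",
    "GFG": "exon prediction from genomic sequences",
    "ENST": "exon prediction from genomic sequences",
    "GBI": "hand-edited analysis of genomic sequences",
    "FL": "stitched or stretched genomic sequences",
    "INCY": "full length transcript and exon prediction from EST mapping",
}


def analysis_method(component_id: str) -> Optional[str]:
    """Return the analysis type for an identifier, or None if no listed prefix matches."""
    upper = component_id.upper()
    for prefix, method in PREFIX_METHODS.items():
        if upper.startswith(prefix):
            return method
    return None
```

Identifiers with no listed prefix (for example, plain Incyte cDNA numbers) return `None` rather than raising an error.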

[0207] In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in Table 4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte cDNA identification numbers are not shown.

[0208] Table 5 shows the representative cDNA libraries for those full length polynucleotides which were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte cDNA library which is most frequently represented by the Incyte cDNA sequences which were used to assemble and confirm the above polynucleotides. The tissues and vectors which were used to construct the cDNA libraries shown in Table 5 are described in Table 6.

[0209] The invention also encompasses INTSIG variants. A preferred INTSIG variant is one which has at least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid sequence identity to the INTSIG amino acid sequence, and which contains at least one functional or structural characteristic of INTSIG.

[0210] Various embodiments also encompass polynucleotides which encode INTSIG. In a particular embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected from the group consisting of SEQ ID NO:24-46, which encodes INTSIG. The polynucleotide sequences of SEQ ID NO:24-46, as presented in the Sequence Listing, embrace the equivalent RNA sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose.
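
At the level of the written sequence, the equivalent RNA sequence described above is obtained by replacing each thymine with uracil. A minimal sketch follows; the function name and the accepted alphabet (A, C, G, T, and N for an unspecified base) are assumptions made for illustration.

```python
def to_rna(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence string (T -> U), preserving case."""
    unexpected = set(dna) - set("ACGTNacgtn")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.replace("T", "U").replace("t", "u")
```

The change from deoxyribose to ribose has no counterpart in the letter sequence, so only the base substitution is represented.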

[0211] The invention also encompasses variants of a polynucleotide encoding INTSIG. In particular, such a variant polynucleotide will have at least about 70%, or alternatively at least about 85%, or even at least about 95% polynucleotide sequence identity to a polynucleotide encoding INTSIG. A particular aspect of the invention encompasses a variant of a polynucleotide comprising a sequence selected from the group consisting of SEQ ID NO:24-46 which has at least about 70%, or alternatively at least about 85%, or even at least about 95% polynucleotide sequence identity to a nucleic acid sequence selected from the group consisting of SEQ ID NO:24-46. Any one of the polynucleotide variants described above can encode a polypeptide which contains at least one functional or structural characteristic of INTSIG.

[0212] In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant of a polynucleotide encoding INTSIG. A splice variant may have portions which have significant sequence identity to a polynucleotide encoding INTSIG, but will generally have a greater or lesser number of polynucleotides due to additions or deletions of blocks of sequence arising from alternate splicing of exons during mRNA processing. A splice variant may have less than about 70%, or alternatively less than about 60%, or alternatively less than about 50% polynucleotide sequence identity to a polynucleotide encoding INTSIG over its entire length; however, portions of the splice variant will have at least about 70%, or alternatively at least about 85%, or alternatively at least about 95%, or alternatively 100% polynucleotide sequence identity to portions of the polynucleotide encoding INTSIG. Any one of the splice variants described above can encode a polypeptide which contains at least one functional or structural characteristic of INTSIG.

[0213] It will be appreciated by those skilled in the art that as a result of the degeneracy of the genetic code, a multitude of polynucleotide sequences encoding INTSIG, some bearing minimal similarity to the polynucleotide sequences of any known and naturally occurring gene, may be produced. Thus, the invention contemplates each and every possible variation of polynucleotide sequence that could be made by selecting combinations based on possible codon choices. These combinations are made in accordance with the standard triplet genetic code as applied to the polynucleotide sequence of naturally occurring INTSIG, and all such variations are to be considered as being specifically disclosed.
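
The scale of the degeneracy described above can be estimated by multiplying, for each residue, the number of codons that encode it in the standard genetic code. The sketch below counts coding sequences for a peptide given in one-letter code; the function name is hypothetical and stop codons are not counted.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons in total).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode the given peptide."""
    residues = peptide.upper()
    unknown = set(residues) - set(CODON_COUNTS)
    if unknown:
        raise ValueError(f"unknown residues: {sorted(unknown)}")
    return prod(CODON_COUNTS[aa] for aa in residues)
```

Because the count is a product over residues, it grows exponentially with peptide length, which is why a full-length polypeptide has a “multitude” of possible coding sequences.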

[0214] Although polynucleotides which encode INTSIG and its variants are generally capable of hybridizing to polynucleotides encoding naturally occurring INTSIG under appropriately selected conditions of stringency, it may be advantageous to produce polynucleotides encoding INTSIG or its derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which particular codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence encoding INTSIG and its derivatives without altering the encoded amino acid sequences include the production of RNA transcripts having more desirable properties, such as a greater half-life, than transcripts produced from the naturally occurring sequence.

[0215] The invention also encompasses production of polynucleotides which encode INTSIG and INTSIG derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the synthetic polynucleotide may be inserted into any of the many available expression vectors and cell systems using reagents well known in the art. Moreover, synthetic chemistry may be used to introduce mutations into a polynucleotide encoding INTSIG or any fragment thereof.

[0216] Embodiments of the invention can also include polynucleotides that are capable of hybridizing to the claimed polynucleotides, and, in particular, to those having the sequences shown in SEQ ID NO:24-46 and fragments thereof, under various conditions of stringency (Wahl, G. M. and S. L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A. R. (1987) Methods Enzymol. 152:507-511). Hybridization conditions, including annealing and wash conditions, are described in “Definitions.”

[0217] Methods for DNA sequencing are well known in the art and may be used to practice any of the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland Ohio), Taq polymerase (Applied Biosystems), thermostable T7 polymerase (Amersham Biosciences, Piscataway N.J.), or combinations of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification system (Invitrogen, Carlsbad Calif.). Preferably, sequence preparation is automated with machines such as the MICROLAB 2200 liquid transfer system (Hamilton, Reno Nev.), PTC200 thermal cycler (MJ Research, Watertown Mass.) and ABI CATALYST 800 thermal cycler (Applied Biosystems). Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system (Applied Biosystems), the MEGABACE 1000 DNA sequencing system (Amersham Biosciences), or other systems known in the art. The resulting sequences are analyzed using a variety of algorithms which are well known in the art (Ausubel et al., supra, ch. 7; Meyers, R. A. (1995) Molecular Biology and Biotechnology, Wiley VCH, New York N.Y., pp. 856-853).

[0218] The nucleic acids encoding INTSIG may be extended utilizing a partial nucleotide sequence and employing various PCR-based methods known in the art to detect upstream sequences, such as promoters and regulatory elements. For example, one method which may be employed, restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic DNA within a cloning vector (Sarkar, G. (1993) PCR Methods Applic. 2:318-322). Another method, inverse PCR, uses primers that extend in divergent directions to amplify unknown sequence from a circularized template. The template is derived from restriction fragments comprising a known genomic locus and surrounding sequences (Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). A third method, capture PCR, involves PCR amplification of DNA fragments adjacent to known sequences in human and yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR Methods Applic. 1:111-119). In this method, multiple restriction enzyme digestions and ligations may be used to insert an engineered double-stranded sequence into a region of unknown sequence before performing PCR. Other methods which may be used to retrieve unknown sequences are known in the art (Parker, J. D. et al. (1991) Nucleic Acids Res. 19:3055-3060). Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo Alto Calif.) to walk genomic DNA. This procedure avoids the need to screen libraries and is useful in finding intron/exon junctions. For all PCR-based methods, primers may be designed using commercially available software, such as OLIGO 4.06 primer analysis software (National Biosciences, Plymouth Minn.) or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the template at temperatures of about 68° C. to 72° C.
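
The length and GC-content criteria stated above for primers can be checked with a short sketch such as the following. It is not the OLIGO 4.06 software; the function names and default thresholds are assumptions drawn from the stated ranges, and the annealing temperature criterion is omitted because it depends on a melting-temperature model that the text does not specify.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer given as a string of A, C, G, T."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_basic_criteria(
    primer: str, min_len: int = 22, max_len: int = 30, min_gc: float = 0.50
) -> bool:
    """True if the primer is 22 to 30 nt long with GC content of at least 50%."""
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc
```

Because the text gives the limits as approximate (“about”), the thresholds are exposed as parameters rather than fixed constants.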

[0219] When screening for full length cDNAs, it is preferable to use libraries that have been size-selected to include larger cDNAs. In addition, random-primed libraries, which often include sequences containing the 5′ regions of genes, are preferable for situations in which an oligo d(T) library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence into 5′ non-transcribed regulatory regions.

[0220] Capillary electrophoresis systems which are commercially available may be used to analyze the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary sequencing may employ flowable polymers for electrophoretic separation, four different nucleotide-specific, laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire process from loading of samples to computer analysis and electronic data display may be computer controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments which may be present in limited amounts in a particular sample.

[0221] In another embodiment of the invention, polynucleotides or fragments thereof which encode INTSIG may be cloned in recombinant DNA molecules that direct expression of INTSIG, or fragments or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of the genetic code, other polynucleotides which encode substantially the same or a functionally equivalent polypeptide may be produced and used to express INTSIG.

[0222] The polynucleotides of the invention can be engineered using methods generally known in the art in order to alter INTSIG-encoding sequences for a variety of purposes including, but not limited to, modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, and so forth.

[0223] The nucleotides of the present invention may be subjected to DNA shuffling techniques such as MOLECULARBREEDING (Maxygen Inc., Santa Clara Calif.; described in U.S. Pat. No. 5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F. C. et al. (1999) Nat. Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve the biological properties of INTSIG, such as its biological or enzymatic activity or its ability to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene variants is produced using PCR-mediated recombination of gene fragments. The library is then subjected to selection or screening procedures that identify those gene variants with the desired properties. These preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and selection/screening. Thus, genetic diversity is created through “artificial” breeding and rapid molecular evolution. For example, fragments of a single gene containing random point mutations may be recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively, fragments of a given gene may be recombined with fragments of homologous genes in the same gene family, either from the same or different species, thereby maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable manner. In another embodiment, polynucleotides encoding INTSIG may be synthesized, in whole or in part, using one or more chemical methods well known in the art (Caruthers, M. H. et al. (1980) Nucleic Acids Symp. Ser. 7:215-223; Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232). Alternatively, INTSIG itself or a fragment thereof may be synthesized using chemical methods known in the art. For example, peptide synthesis can be performed using various solution-phase or solid-phase techniques (Creighton, T. (1984) Proteins, Structures and Molecular Properties, W H Freeman, New York N.Y., pp. 55-60; Roberge, J. Y. et al. (1995) Science 269:202-204). Automated synthesis may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the amino acid sequence of INTSIG, or any part thereof, may be altered during direct synthesis and/or combined with sequences from other proteins, or any part thereof, to produce a variant polypeptide or a polypeptide having a sequence of a naturally occurring polypeptide.

[0224] The peptide may be substantially purified by preparative high performance liquid chromatography (Chiez, R. M. and F. Z. Regnier (1990) Methods Enzymol. 182:392-421). The composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing (Creighton, supra, pp. 28-53).

[0225] In order to express a biologically active INTSIG, the polynucleotides encoding INTSIG or derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for transcriptional and translational control of the inserted coding sequence in a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and inducible promoters, and 5′ and 3′ untranslated regions in the vector and in polynucleotides encoding INTSIG. Such elements may vary in their strength and specificity. Specific initiation signals may also be used to achieve more efficient translation of polynucleotides encoding INTSIG. Such signals include the ATG initiation codon and adjacent sequences, e.g., the Kozak sequence. In cases where a polynucleotide sequence encoding INTSIG and its initiation codon and upstream regulatory sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous translational control signals including an in-frame ATG initiation codon should be provided by the vector. Exogenous translational elements and initiation codons may be of various origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers appropriate for the particular host cell system used (Scharf, D. et al. (1994) Results Probl. Cell Differ. 20:125-162).

[0226] Methods which are well known to those skilled in the art may be used to construct expression vectors containing polynucleotides encoding INTSIG and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination (Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y., ch. 4, 8, and 16-17; Ausubel et al., supra, ch. 1, 3, and 15).

[0227] A variety of expression vector/host systems may be utilized to contain and express polynucleotides encoding INTSIG. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems (Sambrook, supra; Ausubel et al., supra; Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Engelhard, E. K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York N.Y., pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355). Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for delivery of polynucleotides to the targeted organ, tissue, or cell population (Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5:350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA 90:6340-6344; Buller, R. M. et al. (1985) Nature 317:813-815; McGregor, D. P. et al. (1994) Mol. Immunol. 31:219-226; Verma, I. M. and N. Somia (1997) Nature 389:239-242). The invention is not limited by the host cell employed.

[0228] In bacterial systems, a number of cloning and expression vectors may be selected depending upon the use intended for polynucleotides encoding INTSIG. For example, routine cloning, subcloning, and propagation of polynucleotides encoding INTSIG can be achieved using a multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla Calif.) or PSPORT1 plasmid (Invitrogen). Ligation of polynucleotides encoding INTSIG into the vector's multiple cloning site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of nested deletions in the cloned sequence (Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509). When large quantities of INTSIG are needed, e.g., for the production of antibodies, vectors which direct high level expression of INTSIG may be used. For example, vectors containing the strong, inducible SP6 or T7 bacteriophage promoter may be used.

[0229] Yeast expression systems may be used for production of INTSIG. A number of vectors containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such vectors direct either the secretion or intracellular retention of expressed proteins and enable integration of foreign polynucleotide sequences into the host genome for stable propagation (Ausubel et al., supra; Bitter, G. A. et al. (1987) Methods Enzymol. 153:516-544; Scorer, C. A. et al. (1994) Bio/Technology 12:181-184).

[0230] Plant systems may also be used for expression of INTSIG. Transcription of polynucleotides encoding INTSIG may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J. 6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters may be used (Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105). These constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated transfection (The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York N.Y., pp. 191-196).

[0231] In mammalian cells, a number of viral-based expression systems may be utilized. In cases where an adenovirus is used as an expression vector, polynucleotides encoding INTSIG may be ligated into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain infective virus which expresses INTSIG in host cells (Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based vectors may also be used for high-level protein expression.

[0232] Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers, or vesicles) for therapeutic purposes (Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355).

[0233] For long term production of recombinant proteins in mammalian systems, stable expression of INTSIG in cell lines is preferred. For example, polynucleotides encoding INTSIG can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media before being switched to selective media. The purpose of the selectable marker is to confer resistance to a selective agent, and its presence allows growth and recovery of cells which successfully express the introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue culture techniques appropriate to the cell type.

[0234] Any number of selection systems may be used to recover transformed cell lines. These include, but are not limited to, the herpes simplex virus thymidine kinase and adenine phosphoribosyltransferase genes, for use in tk⁻ and apr⁻ cells respectively (Wigler, M. et al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823). Also, antimetabolite, antibiotic, or herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to methotrexate; neo confers resistance to the aminoglycosides neomycin and G418; and als and pat (phosphinothricin acetyltransferase) confer resistance to chlorsulfuron and phosphinothricin, respectively (Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14). Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular requirements for metabolites (Hartman, S. C. and R. C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051). Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; Clontech), β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used. These markers can be used not only to identify transformants, but also to quantify the amount of transient or stable protein expression attributable to a specific vector system (Rhodes, C. A. (1995) Methods Mol. Biol. 55:121-131).

[0235] Although the presence/absence of marker gene expression suggests that the gene of interest is also present, the presence and expression of the gene may need to be confirmed. For example, if the sequence encoding INTSIG is inserted within a marker gene sequence, transformed cells containing polynucleotides encoding INTSIG can be identified by the absence of marker gene function. Alternatively, a marker gene can be placed in tandem with a sequence encoding INTSIG under the control of a single promoter. Expression of the marker gene in response to induction or selection usually indicates expression of the tandem gene as well.

[0236] In general, host cells that contain the polynucleotide encoding INTSIG and that express INTSIG may be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or chip based technologies for the detection and/or quantification of nucleic acid or protein sequences. Immunological methods for detecting and measuring the expression of INTSIG using either specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes on INTSIG is preferred, but a competitive binding assay may be employed. These and other assays are well known in the art (Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul Minn., Sect. IV; Coligan, J. E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and Wiley-Interscience, New York N.Y.; Pound, J. D. (1998) Immunochemical Protocols, Humana Press, Totowa N.J.).

[0237] A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization or PCR probes for detecting sequences related to polynucleotides encoding INTSIG include oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, polynucleotides encoding INTSIG, or any fragments thereof, may be cloned into a vector for the production of an mRNA probe. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety of commercially available kits, such as those provided by Amersham Biosciences, Promega (Madison Wis.), and US Biochemical. Suitable reporter molecules or labels which may be used for ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

[0238] Host cells transformed with polynucleotides encoding INTSIG may be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The protein produced by a transformed cell may be secreted or retained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode INTSIG may be designed to contain signal sequences which direct secretion of INTSIG through a prokaryotic or eukaryotic cell membrane.

[0239] In addition, a host cell strain may be chosen for its ability to modulate expression of the inserted polynucleotides or to process the expressed protein in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a “prepro” or “pro” form of the protein may also be used to specify protein targeting, folding, and/or activity. Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture Collection (ATCC, Manassas Va.) and may be chosen to ensure the correct modification and processing of the foreign protein.

[0240] In another embodiment of the invention, natural, modified, or recombinant polynucleotides encoding INTSIG may be ligated to a heterologous sequence resulting in translation of a fusion protein in any of the aforementioned host systems. For example, a chimeric INTSIG protein containing a heterologous moiety that can be recognized by a commercially available antibody may facilitate the screening of peptide libraries for inhibitors of INTSIG activity. Heterologous protein and peptide moieties may also facilitate purification of fusion proteins using commercially available affinity matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site located between the INTSIG encoding sequence and the heterologous protein sequence, so that INTSIG may be cleaved away from the heterologous moiety following purification. Methods for fusion protein expression and purification are discussed in Ausubel et al. (supra, ch. 10 and 16). A variety of commercially available kits may also be used to facilitate expression and purification of fusion proteins.

[0241] In another embodiment, synthesis of radiolabeled INTSIG may be achieved in vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple transcription and translation of protein-coding sequences operably associated with the T7, T3, or SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for example, ³⁵S-methionine.

[0242] INTSIG, fragments of INTSIG, or variants of INTSIG may be used to screen for compounds that specifically bind to INTSIG. One or more test compounds may be screened for specific binding to INTSIG. In various embodiments, 1, 2, 3, 4, 5, 10, 20, 50, 100, or 200 test compounds can be screened for specific binding to INTSIG. Examples of test compounds can include antibodies, anticalins, oligonucleotides, proteins (e.g., ligands or receptors), or small molecules.

[0243] In related embodiments, variants of INTSIG can be used to screen for binding of test compounds, such as antibodies, to INTSIG, a variant of INTSIG, or a combination of INTSIG and/or one or more variants of INTSIG. In an embodiment, a variant of INTSIG can be used to screen for compounds that bind to a variant of INTSIG, but not to INTSIG having the exact sequence of a sequence of SEQ ID NO:1-23. INTSIG variants used to perform such screening can have a range of about 50% to about 99% sequence identity to INTSIG, with various embodiments having 60%, 70%, 75%, 80%, 85%, 90%, and 95% sequence identity.
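
As a non-limiting illustration, the percent sequence identity used to select variants for such screening can be sketched in Python as follows. The sketch assumes that the two sequences have already been aligned to equal length with "-" as the gap character, and that identity is computed over all aligned columns that are not gaps in both sequences. That denominator convention is an assumption made here and may differ from the alignment parameters defined elsewhere in the specification. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.
    Columns that are gaps in both sequences are ignored; a gap opposite a
    residue counts as a mismatch (an assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared

def in_variant_range(identity: float, low: float = 50.0, high: float = 99.0) -> bool:
    """True if identity falls in the roughly 50% to 99% range stated above."""
    return low <= identity <= high

if __name__ == "__main__":
    # Hypothetical peptides, not sequences from SEQ ID NO:1-23.
    reference = "MKTAYIAKQR"
    variant = "MKTAYLAKQK"
    identity = percent_identity(reference, variant)
    print(identity, in_variant_range(identity))
```

Under these assumptions, a sequence identical to the reference scores 100% and therefore falls outside the variant range, consistent with screening for compounds that bind a variant but not the exact sequence.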

[0244] In an embodiment, a compound identified in a screen for specific binding to INTSIG can be closely related to the natural ligand of INTSIG, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a natural binding partner (Coligan, J. E. et al. (1991) Current Protocols in Immunology 1(2):Chapter 5). In another embodiment, the compound thus identified can be a natural ligand of a receptor INTSIG (Howard, A. D. et al. (2001) Trends Pharmacol. Sci. 22:132-140; Wise, A. et al. (2002) Drug Discovery Today 7:235-246).

[0245] In other embodiments, a compound identified in a screen for specific binding to INTSIG can be closely related to the natural receptor to which INTSIG binds, at least a fragment of the receptor, or a fragment of the receptor including all or a portion of the ligand binding site or binding pocket. For example, the compound may be a receptor for INTSIG which is capable of propagating a signal, or a decoy receptor for INTSIG which is not capable of propagating a signal (Ashkenazi, A. and V. M. Dixit (1999) Curr. Opin. Cell Biol. 11:255-260; Mantovani, A. et al. (2001) Trends Immunol. 22:328-336). The compound can be rationally designed using known techniques. Examples of such techniques include those used to construct the compound etanercept (ENBREL; Immunex Corp., Seattle Wash.), which is efficacious for treating rheumatoid arthritis in humans. Etanercept is an engineered p75 tumor necrosis factor (TNF) receptor dimer linked to the Fc portion of human IgG₁ (Taylor, P. C. et al. (2001) Curr. Opin. Immunol. 13:611-616).

[0246] In one embodiment, two or more antibodies having similar or, alternatively, different specificities can be screened for specific binding to INTSIG, fragments of INTSIG, or variants of INTSIG. The binding specificity of the antibodies thus screened can thereby be selected to identify particular fragments or variants of INTSIG. In one embodiment, an antibody can be selected such that its binding specificity allows for preferential identification of specific fragments or variants of INTSIG. In another embodiment, an antibody can be selected such that its binding specificity allows for preferential diagnosis of a specific disease or condition having increased, decreased, or otherwise abnormal production of INTSIG.

[0247] In an embodiment, anticalins can be screened for specific binding to INTSIG, fragments of INTSIG, or variants of INTSIG. Anticalins are ligand-binding proteins that have been constructed based on a lipocalin scaffold (Weiss, G. A. and H. B. Lowman (2000) Chem. Biol. 7:R177-R184; Skerra, A. (2001) J. Biotechnol. 74:257-275). The protein architecture of lipocalins can include a beta-barrel having eight antiparallel beta-strands, which supports four loops at its open end. These loops form the natural ligand-binding site of the lipocalins, a site which can be re-engineered in vitro by amino acid substitutions to impart novel binding specificities. The amino acid substitutions can be made using methods known in the art or described herein, and can include conservative substitutions (e.g., substitutions that do not alter binding specificity) or substitutions that modestly, moderately, or significantly alter binding specificity.

[0248] In one embodiment, screening for compounds which specifically bind to, stimulate, or inhibit INTSIG involves producing appropriate cells which express INTSIG, either as a secreted protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells expressing INTSIG or cell membrane fractions which contain INTSIG are then contacted with a test compound and binding, stimulation, or inhibition of activity of either INTSIG or the compound is analyzed.

[0249] An assay may simply test binding of a test compound to the polypeptide, wherein binding is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example, the assay may comprise the steps of combining at least one test compound with INTSIG, either in solution or affixed to a solid support, and detecting the binding of INTSIG to the compound. Alternatively, the assay may detect or measure binding of a test compound in the presence of a labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a solid support.

[0250] An assay can be used to assess the ability of a compound to bind to its natural ligand and/or to inhibit the binding of its natural ligand to its natural receptors. Examples of such assays include radiolabeling assays such as those described in U.S. Pat. No. 5,914,236 and U.S. Pat. No. 6,372,724. In a related embodiment, one or more amino acid substitutions can be introduced into a polypeptide compound (such as a receptor) to improve or alter its ability to bind to its natural ligands (Matthews, D. J. and J. A. Wells. (1994) Chem. Biol. 1:25-30). In another related embodiment, one or more amino acid substitutions can be introduced into a polypeptide compound (such as a ligand) to improve or alter its ability to bind to its natural receptors (Cunningham, B. C. and J. A. Wells (1991) Proc. Natl. Acad. Sci. USA 88:3407-3411; Lowman, H. B. et al. (1991) J. Biol. Chem. 266:10982-10988).

[0251] INTSIG, fragments of INTSIG, or variants of INTSIG may be used to screen for compounds that modulate the activity of INTSIG. Such compounds may include agonists, antagonists, or partial or inverse agonists. In one embodiment, an assay is performed under conditions permissive for INTSIG activity, wherein INTSIG is combined with at least one test compound, and the activity of INTSIG in the presence of a test compound is compared with the activity of INTSIG in the absence of the test compound. A change in the activity of INTSIG in the presence of the test compound is indicative of a compound that modulates the activity of INTSIG. Alternatively, a test compound is combined with an in vitro or cell-free system comprising INTSIG under conditions suitable for INTSIG activity, and the assay is performed. In either of these assays, a test compound which modulates the activity of INTSIG may do so indirectly and need not come in direct contact with INTSIG. At least one and up to a plurality of test compounds may be screened.
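
For illustration only, the comparison of INTSIG activity in the presence and absence of a test compound can be sketched in Python as follows. The 20% cutoff used to call a change is a hypothetical value chosen for the example and is not specified in this disclosure, as are the function names and the activity readings. In practice the cutoff would depend on the assay and its variability.

```python
def percent_change_in_activity(with_compound: float, without_compound: float) -> float:
    """Percent change in measured activity relative to the no-compound control."""
    if without_compound <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (with_compound - without_compound) / without_compound

def classify_modulation(change_pct: float, threshold_pct: float = 20.0) -> str:
    """Label a compound by the direction of the change it produces.
    threshold_pct is a hypothetical cutoff, not a value from the source."""
    if change_pct >= threshold_pct:
        return "increases activity"
    if change_pct <= -threshold_pct:
        return "decreases activity"
    return "no clear change"

if __name__ == "__main__":
    # Hypothetical readings in arbitrary units: (with compound, control).
    readings = {"compound_A": (50.0, 100.0), "compound_B": (105.0, 100.0)}
    for name, (treated, control) in readings.items():
        change = percent_change_in_activity(treated, control)
        print(name, change, classify_modulation(change))
```

A compound labeled as increasing activity would be a candidate agonist and one labeled as decreasing activity a candidate antagonist, subject to confirmation in further assays.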

[0252] In another embodiment, polynucleotides encoding INTSIG or their mammalian homologs may be “knocked out” in an animal model system using homologous recombination in embryonic stem (ES) cells. Such techniques are well known in the art and are useful for the generation of animal models of human disease (see, e.g., U.S. Pat. No. 5,175,383 and U.S. Pat. No. 5,767,337). For example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M. R. (1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host genome by homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP system to knock out a gene of interest in a tissue- or developmental stage-specific manner (Marth, J. D. (1996) J. Clin. Invest. 97:1999-2002; Wagner, K. U. et al. (1997) Nucleic Acids Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

[0253] Polynucleotides encoding INTSIG may also be manipulated in vitro in ES cells derived from human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J. A. et al. (1998) Science 282:1145-1147).

[0254] Polynucleotides encoding INTSIG can also be used to create “knockin” humanized animals (pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of a polynucleotide encoding INTSIG is injected into animal ES cells, and the injected sequence integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a mammal inbred to overexpress INTSIG, e.g., by secreting INTSIG in its milk, may also serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

[0255] Therapeutics

[0256] Chemical and structural similarity, e.g., in the context of sequences and motifs, exists between regions of INTSIG and intracellular signaling molecules. In addition, the expression of INTSIG is closely associated with brain tissue (including but not limited to dentate nucleus, striatum, globus pallidus, and posterior putamen tissue), gastrointestinal tissue, thyroid tissue, heart tissue, prostate tissue, adipocyte tissue, uterine tissue, lung tissue, bladder and uterine tumor tissue, bone tumor tissue, and diseased colon tissue. In addition, examples of tissues expressing INTSIG can be found in Table 6 and can also be found in Example XI. Therefore, INTSIG appears to play a role in cell proliferative, autoimmune/inflammatory, neurological, gastrointestinal, reproductive, cardiovascular, developmental, and vesicle trafficking disorders. In the treatment of disorders associated with increased INTSIG expression or activity, it is desirable to decrease the expression or activity of INTSIG. In the treatment of disorders associated with decreased INTSIG expression or activity, it is desirable to increase the expression or activity of INTSIG.

[0257] Therefore, in one embodiment, INTSIG or a fragment or derivative thereof may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of INTSIG. Examples of such disorders include, but are not limited to, a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gallbladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; an autoimmune/inflammatory disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative 
colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoan and helminthic infections, and trauma; a neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; a gastrointestinal disorder such as dysphagia, peptic esophagitis, esophageal spasm, esophageal 
stricture, esophageal carcinoma, dyspepsia, indigestion, gastritis, gastric carcinoma, anorexia, nausea, emesis, gastroparesis, antral or pyloric edema, abdominal angina, pyrosis, gastroenteritis, intestinal obstruction, infections of the intestinal tract, peptic ulcer, cholelithiasis, cholecystitis, cholestasis, pancreatitis, pancreatic carcinoma, biliary tract disease, hepatitis, hyperbilirubinemia, cirrhosis, passive congestion of the liver, hepatoma, infectious colitis, ulcerative colitis, ulcerative proctitis, Crohn's disease, Whipple's disease, Mallory-Weiss syndrome, colonic carcinoma, colonic obstruction, irritable bowel syndrome, short bowel syndrome, diarrhea, constipation, gastrointestinal hemorrhage, acquired immunodeficiency syndrome (AIDS) enteropathy, jaundice, hepatic encephalopathy, hepatorenal syndrome, hepatic steatosis, hemochromatosis, Wilson's disease, alpha₁-antitrypsin deficiency, Reye's syndrome, primary sclerosing cholangitis, liver infarction, portal vein obstruction and thrombosis, centrilobular necrosis, peliosis hepatis, hepatic vein thrombosis, veno-occlusive disease, preeclampsia, eclampsia, acute fatty liver of pregnancy, intrahepatic cholestasis of pregnancy, and hepatic tumors including nodular hyperplasias, adenomas, and carcinomas; a reproductive disorder such as a disorder of prolactin production, infertility, including tubal disease, ovulatory defects, endometriosis, a disruption of the estrous cycle, a disruption of the menstrual cycle, polycystic ovary syndrome, ovarian hyperstimulation syndrome, an endometrial or ovarian tumor, a uterine fibroid, autoimmune disorders, ectopic pregnancy, teratogenesis, cancer of the breast, fibrocystic breast disease, galactorrhea, a disruption of spermatogenesis, abnormal sperm physiology, cancer of the testis, cancer of the prostate, benign prostatic hyperplasia, prostatitis, Peyronie's disease, impotence, carcinoma of the male breast, gynecomastia, hypergonadotropic and hypogonadotropic 
hypogonadism, pseudohermaphroditism, azoospermia, premature ovarian failure, acrosin deficiency, delayed puberty, retrograde ejaculation and anejaculation, haemangioblastomas, cysts, phaeochromocytomas, paraganglioma, cystadenomas of the epididymis, and endolymphatic sac tumours; a cardiovascular disorder such as arteriovenous fistula, atherosclerosis, hypertension, vasculitis, Raynaud's disease, aneurysms, arterial dissections, varicose veins, thrombophlebitis and phlebothrombosis, vascular tumors, and complications of thrombolysis, balloon angioplasty, vascular replacement, and coronary artery bypass graft surgery, congestive heart failure, ischemic heart disease, angina pectoris, myocardial infarction, hypertensive heart disease, degenerative valvular heart disease, calcific aortic valve stenosis, congenitally bicuspid aortic valve, mitral annular calcification, mitral valve prolapse, rheumatic fever and rheumatic heart disease, infective endocarditis, nonbacterial thrombotic endocarditis, endocarditis of systemic lupus erythematosus, carcinoid heart disease, cardiomyopathy, myocarditis, pericarditis, neoplastic heart disease, congenital heart disease, and complications of cardiac transplantation; a developmental disorder such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss; and a vesicle trafficking disorder such as cystic fibrosis, glucose-galactose malabsorption syndrome, 
hypercholesterolemia, diabetes mellitus, diabetes insipidus, hyper- and hypoglycemia, Graves' disease, goiter, Cushing's disease, and Addison's disease, gastrointestinal disorders including ulcerative colitis, gastric and duodenal ulcers, other conditions associated with abnormal vesicle trafficking, including acquired immunodeficiency syndrome (AIDS), allergies including hay fever, asthma, and urticaria (hives), autoimmune hemolytic anemia, proliferative glomerulonephritis, inflammatory bowel disease, multiple sclerosis, myasthenia gravis, rheumatoid and osteoarthritis, scleroderma, Chediak-Higashi and Sjogren's syndromes, systemic lupus erythematosus, toxic shock syndrome, and traumatic tissue damage.

[0258] In another embodiment, a vector capable of expressing INTSIG or a fragment or derivative thereof may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of INTSIG including, but not limited to, those described above.

[0259] In a further embodiment, a composition comprising a substantially purified INTSIG in conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of INTSIG including, but not limited to, those provided above.

[0260] In still another embodiment, an agonist which modulates the activity of INTSIG may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of INTSIG including, but not limited to, those listed above.

[0261] In a further embodiment, an antagonist of INTSIG may be administered to a subject to treat or prevent a disorder associated with increased expression or activity of INTSIG. Examples of such disorders include, but are not limited to, those cell proliferative, autoimmune/inflammatory, neurological, gastrointestinal, reproductive, cardiovascular, developmental, and vesicle trafficking disorders described above. In one aspect, an antibody which specifically binds INTSIG may be used directly as an antagonist or indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissues which express INTSIG.

[0262] In an additional embodiment, a vector expressing the complement of the polynucleotide encoding INTSIG may be administered to a subject to treat or prevent a disorder associated with increased expression or activity of INTSIG including, but not limited to, those described above.

[0263] In other embodiments, any protein, agonist, antagonist, antibody, complementary sequence, or vector embodiments may be administered in combination with other appropriate therapeutic agents. Selection of the appropriate agents for use in combination therapy may be made by one of ordinary skill in the art, according to conventional pharmaceutical principles. The combination of therapeutic agents may act synergistically to effect the treatment or prevention of the various disorders described above. Using this approach, one may be able to achieve therapeutic efficacy with lower dosages of each agent, thus reducing the potential for adverse side effects.

[0264] An antagonist of INTSIG may be produced using methods which are generally known in the art. In particular, purified INTSIG may be used to produce antibodies or to screen libraries of pharmaceutical agents to identify those which specifically bind INTSIG. Antibodies to INTSIG may also be generated using methods that are well known in the art. Such antibodies may include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit dimer formation) are generally preferred for therapeutic use. Single chain antibodies (e.g., from camels or llamas) may be potent enzyme inhibitors and may have advantages in the design of peptide mimetics, and in the development of immuno-adsorbents and biosensors (Muyldermans, S. (2001) J. Biotechnol. 74:277-302).

[0265] For the production of antibodies, various hosts including goats, rabbits, rats, mice, camels, dromedaries, llamas, humans, and others may be immunized by injection with INTSIG or with any fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species, various adjuvants may be used to increase immunological response. Such adjuvants include, but are not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol. Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are especially preferable.

[0266] It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to INTSIG have an amino acid sequence consisting of at least about 5 amino acids, and generally will consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or fragments are identical to a portion of the amino acid sequence of the natural protein. Short stretches of INTSIG amino acids may be fused with those of another protein, such as KLH, and antibodies to the chimeric molecule may be produced.

[0267] Monoclonal antibodies to INTSIG may be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique (Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. Immunol. Methods 81:31-42; Cote, R. J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; Cole, S. P. et al. (1984) Mol. Cell Biol. 62:109-120).

[0268] In addition, techniques developed for the production of “chimeric antibodies,” such as the splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can be used (Morrison, S. L. et al. (1984) Proc. Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M. S. et al. (1984) Nature 312:604-608; Takeda, S. et al. (1985) Nature 314:452-454). Alternatively, techniques described for the production of single chain antibodies may be adapted, using methods known in the art, to produce INTSIG-specific single chain antibodies.

[0269] Antibodies with related specificity, but of distinct idiotypic composition, may be generated by chain shuffling from random combinatorial immunoglobulin libraries (Burton, D. R. (1991) Proc. Natl. Acad. Sci. USA 88:10134-10137).

[0270] Antibodies may also be produced by inducing in vivo production in the lymphocyte population or by screening immunoglobulin libraries or panels of highly specific binding reagents as disclosed in the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299).

[0271] Antibody fragments which contain specific binding sites for INTSIG may also be generated. For example, such fragments include, but are not limited to, F(ab)₂ fragments produced by pepsin digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the F(ab)₂ fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity (Huse, W. D. et al. (1989) Science 246:1275-1281).

[0272] Various immunoassays may be used for screening to identify antibodies having the desired specificity. Numerous protocols for competitive binding or immunoradiometric assays using either polyclonal or monoclonal antibodies with established specificities are well known in the art. Such immunoassays typically involve the measurement of complex formation between INTSIG and its specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering INTSIG epitopes is generally used, but a competitive binding assay may also be employed (Pound, supra).

[0273] Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques may be used to assess the affinity of antibodies for INTSIG. Affinity is expressed as an association constant, K_(a), which is defined as the molar concentration of INTSIG-antibody complex divided by the molar concentrations of free antigen and free antibody under equilibrium conditions. The K_(a) determined for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for multiple INTSIG epitopes, represents the average affinity, or avidity, of the antibodies for INTSIG. The K_(a) determined for a preparation of monoclonal antibodies, which are monospecific for a particular INTSIG epitope, represents a true measure of affinity. High-affinity antibody preparations with K_(a) ranging from about 10⁹ to 10¹² L/mole are preferred for use in immunoassays in which the INTSIG-antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations with K_(a) ranging from about 10⁶ to 10⁷ L/mole are preferred for use in immunopurification and similar procedures which ultimately require dissociation of INTSIG, preferably in active form, from the antibody (Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC; Liddell, J. E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York N.Y.).
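
By way of illustration only, the definition of the association constant can be sketched in Python as follows. The function names and the example concentrations are hypothetical and are not taken from the specification; the two preferred ranges (about 10⁹ to 10¹² L/mole and about 10⁶ to 10⁷ L/mole) are those stated in the paragraph above.

```python
def association_constant(complex_molar, free_antigen_molar, free_antibody_molar):
    """Return K_a in L/mole: [complex] / ([free antigen] * [free antibody]).

    All three arguments are equilibrium molar concentrations (mole/L).
    """
    if free_antigen_molar <= 0 or free_antibody_molar <= 0:
        raise ValueError("free antigen and free antibody concentrations must be positive")
    return complex_molar / (free_antigen_molar * free_antibody_molar)


def preferred_use(k_a):
    """Map a K_a value onto the preferred uses named in the text (illustrative labels)."""
    if 1e9 <= k_a <= 1e12:
        return "high-affinity: immunoassay"
    if 1e6 <= k_a <= 1e7:
        return "low-affinity: immunopurification"
    return "outside the preferred ranges"


# Hypothetical equilibrium measurement: 5 nM complex, 1 nM free antigen, 1 nM free antibody.
k_a = association_constant(5e-9, 1e-9, 1e-9)
print(f"K_a = {k_a:.2e} L/mole ({preferred_use(k_a)})")
```

For a polyclonal preparation the value computed this way would be an average over many epitope-specific affinities, as noted above, rather than a single true affinity.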

[0274] The titer and avidity of polyclonal antibody preparations may be further evaluated to determine the quality and suitability of such preparations for certain downstream applications. For example, a polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml, preferably 5-10 mg specific antibody/ml, is generally employed in procedures requiring precipitation of INTSIG-antibody complexes. Procedures for evaluating antibody specificity, titer, and avidity, and guidelines for antibody quality and usage in various applications, are generally available (Catty, supra; Coligan et al., supra).

[0275] In another embodiment of the invention, polynucleotides encoding INTSIG, or any fragment or complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene expression can be achieved by designing complementary sequences or antisense molecules (DNA, RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding INTSIG. Such technology is well known in the art, and antisense oligonucleotides or larger fragments can be designed from various locations along the coding or control regions of sequences encoding INTSIG (Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press, Totowa N.J.).

[0276] In therapeutic use, any gene delivery system suitable for introduction of the antisense sequences into appropriate target cells can be used. Antisense sequences can be delivered intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence complementary to at least a portion of the cellular sequence encoding the target protein (Slater, J. E. et al. (1998) J. Allergy Clin. Immunol. 102:469-475; Scanlon, K. J. et al. (1995) 9:1288-1296). Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as retrovirus and adeno-associated virus vectors (Miller, A. D. (1990) Blood 76:271; Ausubel et al., supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63:323-347). Other gene delivery mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in the art (Rossi, J. J. (1995) Br. Med. Bull. 51:217-225; Boado, R. J. et al. (1998) J. Pharm. Sci. 87:1308-1315; Morris, M. C. et al. (1997) Nucleic Acids Res. 25:2730-2736).

[0277] In another embodiment of the invention, polynucleotides encoding INTSIG may be used for somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency (Blaese, R. M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R. G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R. G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal, R. G. (1995) Science 270:404-410; Verma, I. M. and N. Somia (1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic deficiency in INTSIG expression or regulation causes disease, the expression of INTSIG from an appropriate population of transduced cells may alleviate the clinical manifestations caused by the genetic deficiency.

[0278] In a further embodiment of the invention, diseases or disorders caused by deficiencies in INTSIG are treated by constructing mammalian expression vectors encoding INTSIG and introducing these vectors by mechanical means into INTSIG-deficient cells. Mechanical transfer technologies for use with cells in vivo or ex vivo include (i) direct DNA microinjection into individual cells, (ii) ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the use of DNA transposons (Morgan, R. A. and W. F. Anderson (1993) Annu. Rev. Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J.-L. and H. Récipon (1998) Curr. Opin. Biotechnol. 9:445-450).

[0279] Expression vectors that may be effective for the expression of INTSIG include, but are not limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors (Invitrogen, Carlsbad Calif.), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla Calif.), and PTET-OFF, PTET-ON, PTRE2, PR2-LUC, PTK-HYG (Clontech, Palo Alto Calif.). INTSIG may be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F. M. V. and H. M. Blau (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid (Invitrogen); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND; Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter (Rossi, F. M. V. and H. M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of the endogenous gene encoding INTSIG from a normal individual.

[0280] Commercially available liposome transformation kits (e.g., the PERFECT LIPID TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver polynucleotides to target cells in culture and require minimal effort to optimize experimental parameters. In the alternative, transformation is performed using the calcium phosphate method (Graham, F. L. and A. J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al. (1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of these standardized mammalian transfection protocols.

[0281] In another embodiment of the invention, diseases or disorders caused by genetic defects with respect to INTSIG expression are treated by constructing a retrovirus vector consisting of (i) the polynucleotide encoding INTSIG under the control of an independent promoter or the retrovirus long terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al. (1987) J. Virol. 61:1647-1650; Bender, M. A. et al. (1987) J. Virol. 61:1639-1646; Adam, M. A. and A. D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R. et al. (1998) J. Virol. 72:9873-9880). U.S. Pat. No. 5,910,434 to Rigg (“Method for obtaining retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant”) discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by reference.

[0282] Propagation of retrovirus vectors, transduction of a population of cells (e.g., CD4⁺ T-cells), and the return of transduced cells to a patient are procedures well known to persons skilled in the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M. L. (1997) J. Virol. 71:4707-4716; Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood 89:2283-2290).

[0283] In an embodiment, an adenovirus-based gene therapy delivery system is used to deliver polynucleotides encoding INTSIG to cells which have one or more genetic abnormalities with respect to the expression of INTSIG. The construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M. E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are described in U.S. Pat. No. 5,707,618 to Armentano (“Adenovirus vectors for gene therapy”), hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P. A. et al. (1999; Annu. Rev. Nutr. 19:511-544) and Verma, I. M. and N. Somia (1997; Nature 389:239-242).

[0284] In another embodiment, a herpes-based gene therapy delivery system is used to deliver polynucleotides encoding INTSIG to target cells which have one or more genetic abnormalities with respect to the expression of INTSIG. The use of herpes simplex virus (HSV)-based vectors may be especially valuable for introducing INTSIG to cells of the central nervous system, for which HSV has a tropism. The construction and packaging of herpes-based vectors are well known to those with ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395). The construction of an HSV-1 virus vector has also been disclosed in detail in U.S. Pat. No. 5,804,413 to DeLuca (“Herpes simplex virus strains for gene transfer”), which is hereby incorporated by reference. U.S. Pat. No. 5,804,413 teaches the use of recombinant HSV d92 which consists of a genome containing at least one exogenous gene to be transferred to a cell under the control of the appropriate promoter for purposes including human gene therapy. Also taught by this patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22. For HSV vectors, see also Goins, W. F. et al. (1999; J. Virol. 73:519-532) and Xu, H. et al. (1994; Dev. Biol. 163:152-161). The manipulation of cloned herpesvirus sequences, the generation of recombinant virus following the transfection of multiple plasmids containing different segments of the large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of cells with herpesvirus are techniques well known to those of ordinary skill in the art.

[0285] In another embodiment, an alphavirus (positive, single-stranded RNA virus) vector is used to deliver polynucleotides encoding INTSIG to target cells. The biology of the prototypic alphavirus, Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA, resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease and polymerase). Similarly, inserting the coding sequence for INTSIG into the alphavirus genome in place of the capsid-coding region results in the production of a large number of INTSIG-coding RNAs and the synthesis of high levels of INTSIG in vector transduced cells. While alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S. A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the introduction of INTSIG into a variety of cell types. The specific transduction of a subset of cells in a population may require the sorting of cells prior to transduction. The methods of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus infections, are well known to those with ordinary skill in the art.

[0286] Oligonucleotides derived from the transcription initiation site, e.g., between about positions −10 and +10 from the start site, may also be employed to inhibit gene expression. Similarly, inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful because it inhibits the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have been described in the literature (Gee, J. E. et al. (1994) in Huber, B. E. and B. I. Carr, Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco N.Y., pp. 163-177). A complementary sequence or antisense molecule may also be designed to block translation of mRNA by preventing the transcript from binding to ribosomes.

[0287] Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. For example, engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze endonucleolytic cleavage of RNA molecules encoding INTSIG.

[0288] Specific ribozyme cleavage sites within any potential RNA target are initially identified by scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA, GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides, corresponding to the region of the target gene containing the cleavage site, may be evaluated for secondary structural features which may render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays.
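
The initial scanning step described above can be sketched in Python as follows. This is a minimal illustration, not a disclosed implementation: the function names, the default window length of 17 ribonucleotides, the choice to center each window on the cleavage triplet, and the example sequence are all assumptions made for the sketch. Evaluation of secondary structure and of accessibility by ribonuclease protection assay is not modeled.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")  # the site sequences named in the text


def normalize(seq):
    """Upper-case the sequence and read any DNA-style T as U."""
    return seq.upper().replace("T", "U")


def find_cleavage_sites(seq):
    """Return (0-based position, triplet) for every GUA, GUU, or GUC in the target."""
    rna = normalize(seq)
    return [(i, rna[i:i + 3]) for i in range(len(rna) - 2)
            if rna[i:i + 3] in CLEAVAGE_TRIPLETS]


def candidate_windows(seq, window=17):
    """Return (position, triplet, window sequence) for each cleavage site.

    Each window is `window` ribonucleotides long (15 to 20, per the text) and is
    centered on the triplet where possible, shifted inward at the sequence ends.
    """
    if not 15 <= window <= 20:
        raise ValueError("window length should be between 15 and 20 ribonucleotides")
    rna = normalize(seq)
    if len(rna) < window:
        return []
    results = []
    for pos, triplet in find_cleavage_sites(rna):
        start = pos + 1 - window // 2            # center on the middle base of the triplet
        start = max(0, min(start, len(rna) - window))
        results.append((pos, triplet, rna[start:start + window]))
    return results


# Hypothetical 20-nucleotide target containing a single GUC site.
for pos, triplet, region in candidate_windows("AAAAAAAAGUCAAAAAAAAA"):
    print(pos, triplet, region)
```

Each returned region would then be a candidate for the structural and accessibility evaluation described in the paragraph above.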

[0289] Complementary ribonucleic acid molecules and ribozymes may be prepared by any method known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA molecules encoding INTSIG. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues.

[0290] RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5′ and/or 3′ ends of the molecule, or the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the backbone of the molecule. This concept is inherent in the production of PNAs and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

[0291] An additional embodiment of the invention encompasses a method for screening for a compound which is effective in altering expression of a polynucleotide encoding INTSIG. Compounds which may be effective in altering expression of a specific polynucleotide may include, but are not limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides, transcription factors and other polypeptide transcriptional regulators, and non-macromolecular chemical entities which are capable of interacting with specific polynucleotide sequences. Effective compounds may alter polynucleotide expression by acting as either inhibitors or promoters of polynucleotide expression. Thus, in the treatment of disorders associated with increased INTSIG expression or activity, a compound which specifically inhibits expression of the polynucleotide encoding INTSIG may be therapeutically useful, and in the treatment of disorders associated with decreased INTSIG expression or activity, a compound which specifically promotes expression of the polynucleotide encoding INTSIG may be therapeutically useful.

[0292] At least one, and up to a plurality, of test compounds may be screened for effectiveness in altering expression of a specific polynucleotide. A test compound may be obtained by any method commonly known in the art, including chemical modification of a compound known to be effective in altering polynucleotide expression; selection from an existing, commercially-available or proprietary library of naturally-occurring or non-natural chemical compounds; rational design of a compound based on chemical and/or structural properties of the target polynucleotide; and selection from a library of chemical compounds created combinatorially or randomly. A sample comprising a polynucleotide encoding INTSIG is exposed to at least one test compound thus obtained. The sample may comprise, for example, an intact or permeabilized cell or an in vitro cell-free or reconstituted biochemical system. Alterations in the expression of a polynucleotide encoding INTSIG are assayed by any method commonly known in the art. Typically, the expression of a specific nucleotide is detected by hybridization with a probe having a nucleotide sequence complementary to the sequence of the polynucleotide encoding INTSIG. The amount of hybridization may be quantified, thus forming the basis for a comparison of the expression of the polynucleotide both with and without exposure to one or more test compounds. Detection of a change in the expression of a polynucleotide exposed to a test compound indicates that the test compound is effective in altering the expression of the polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide can be carried out, for example, using a Schizosaccharomyces pombe gene expression system (Atkins, D. et al. (1999) U.S. Pat. No. 5,932,435; Arndt, G. M. et al. (2000) Nucleic Acids Res. 28:E15) or a human cell line such as HeLa (Clarke, M. L. et al. (2000) Biochem. Biophys. Res. Commun. 268:8-13). A particular embodiment of the present invention involves screening a combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide sequence (Bruice, T. W. et al. (1997) U.S. Pat. No. 5,686,242; Bruice, T. W. et al. (2000) U.S. Pat. No. 6,022,691).
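
The comparison of quantified hybridization with and without a test compound can be illustrated with the following Python sketch. It is an assumption-laden example only: the function name, the two-fold threshold used to call a change, and the signal values are hypothetical and are not specified in this disclosure, which leaves the quantification method open.

```python
def expression_change(signal_untreated, signal_treated, fold_threshold=2.0):
    """Compare quantified hybridization signals with and without a test compound.

    Returns (fold change, call). The call is "promoter" when expression rises by at
    least `fold_threshold`, "inhibitor" when it falls by at least the same factor,
    and "no change detected" otherwise. The threshold is a hypothetical choice.
    """
    if signal_untreated <= 0 or signal_treated < 0:
        raise ValueError("signals must be non-negative and the untreated signal positive")
    if fold_threshold <= 1:
        raise ValueError("fold_threshold must be greater than 1")
    fold = signal_treated / signal_untreated
    if fold >= fold_threshold:
        return fold, "promoter"
    if fold <= 1 / fold_threshold:
        return fold, "inhibitor"
    return fold, "no change detected"


# Hypothetical signals (arbitrary units) for three test compounds against one control.
control = 100.0
for name, treated in [("compound A", 25.0), ("compound B", 300.0), ("compound C", 120.0)]:
    fold, call = expression_change(control, treated)
    print(f"{name}: fold change {fold:.2f} ({call})")
```

In practice replicate measurements and a statistical criterion would likely replace the fixed threshold used here.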

[0293] Many methods for introducing vectors into cells or tissues are available and equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells taken from the patient and clonally propagated for autologous transplant back into that same patient. Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved using methods which are well known in the art (Goldman, C. K. et al. (1997) Nat. Biotechnol. 15:462-466).

[0294] Any of the therapeutic methods described above may be applied to any subject in need of such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and monkeys.

[0295] An additional embodiment of the invention relates to the administration of a composition which generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various formulations are commonly known and are thoroughly discussed in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing, Easton Pa.). Such compositions may consist of INTSIG, antibodies to INTSIG, and mimetics, agonists, antagonists, or inhibitors of INTSIG.

[0296] The compositions utilized in this invention may be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.

[0297] Compositions for pulmonary administration may be prepared in liquid or dry powder form. These compositions are generally aerosolized immediately prior to inhalation by the patient. In the case of small molecules (e.g., traditional low molecular weight organic drugs), aerosol delivery of fast-acting formulations is well-known in the art. In the case of macromolecules (e.g., larger peptides and proteins), recent developments in the field of pulmonary delivery via the alveolar region of the lung have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton, J. S. et al., U.S. Pat. No. 5,997,848). Pulmonary delivery has the advantage of administration without needle injection, and obviates the need for potentially toxic penetration enhancers.

[0298] Compositions suitable for use in the invention include compositions wherein the active ingredients are contained in an effective amount to achieve the intended purpose. The determination of an effective dose is well within the capability of those skilled in the art.

[0299] Specialized forms of compositions may be prepared for direct intracellular delivery of macromolecules comprising INTSIG or fragments thereof. For example, liposome preparations containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of the macromolecule. Alternatively, INTSIG or a fragment thereof may be joined to a short cationic N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S. R. et al. (1999) Science 285:1569-1572).

[0300] For any compound, the therapeutically effective dose can be estimated initially either in cell culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs, monkeys, or pigs. An animal model may also be used to determine the appropriate concentration range and route of administration. Such information can then be used to determine useful doses and routes for administration in humans.

[0301] A therapeutically effective dose refers to that amount of active ingredient, for example INTSIG or fragments thereof, antibodies to INTSIG, and agonists, antagonists or inhibitors of INTSIG, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by calculating the ED₅₀ (the dose therapeutically effective in 50% of the population) or LD₅₀ (the dose lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the therapeutic index, which can be expressed as the LD₅₀/ED₅₀ ratio. Compositions which exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are used to formulate a range of dosage for human use. The dosage contained in such compositions is preferably within a range of circulating concentrations that includes the ED₅₀ with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, the sensitivity of the patient, and the route of administration.
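The therapeutic index described above is a simple ratio, and a minimal sketch may make the arithmetic concrete. The function name and the dose values below are hypothetical and chosen only for illustration; both doses are assumed to be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the LD50/ED50 ratio; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses (e.g., mg/kg), for illustration only.
ti = therapeutic_index(ld50=500.0, ed50=10.0)
print(f"therapeutic index = {ti}")
```

A larger ratio indicates a wider margin between the effective and the lethal dose, which is why compositions with large therapeutic indices are preferred.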

[0302] The exact dosage will be determined by the practitioner, in light of factors related to the subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the active moiety or to maintain the desired effect. Factors which may be taken into account include the severity of the disease state, the general health of the subject, the age, weight, and gender of the subject, time and frequency of administration, drug combination(s), reaction sensitivities, and response to therapy. Long-acting compositions may be administered every 3 to 4 days, every week, or biweekly depending on the half-life and clearance rate of the particular formulation.

[0303] Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of about 1 gram, depending upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature and generally available to practitioners in the art. Those skilled in the art will employ different formulations for nucleotides than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, conditions, locations, etc.

[0304] Diagnostics

[0305] In another embodiment, antibodies which specifically bind INTSIG may be used for the diagnosis of disorders characterized by expression of INTSIG, or in assays to monitor patients being treated with INTSIG or agonists, antagonists, or inhibitors of INTSIG. Antibodies useful for diagnostic purposes may be prepared in the same manner as described above for therapeutics. Diagnostic assays for INTSIG include methods which utilize the antibody and a label to detect INTSIG in human body fluids or in extracts of cells or tissues. The antibodies may be used with or without modification, and may be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of reporter molecules, several of which are described above, are known in the art and may be used. A variety of protocols for measuring INTSIG, including ELISAs, RIAs, and FACS, are known in the art and provide a basis for diagnosing altered or abnormal levels of INTSIG expression. Normal or standard values for INTSIG expression are established by combining body fluids or cell extracts taken from normal mammalian subjects, for example, human subjects, with antibodies to INTSIG under conditions suitable for complex formation. The amount of standard complex formation may be quantitated by various methods, such as photometric means. Quantities of INTSIG expressed in subject, control, and disease samples from biopsied tissues are compared with the standard values. Deviation between standard and subject values establishes the parameters for diagnosing disease.

[0306] In another embodiment of the invention, polynucleotides encoding INTSIG may be used for diagnostic purposes. The polynucleotides which may be used include oligonucleotides, complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and quantify gene expression in biopsied tissues in which expression of INTSIG may be correlated with disease. The diagnostic assay may be used to determine absence, presence, and excess expression of INTSIG, and to monitor regulation of INTSIG levels during therapeutic intervention.

[0307] In one aspect, hybridization with PCR probes which are capable of detecting polynucleotides, including genomic sequences, encoding INTSIG or closely related molecules may be used to identify nucleic acid sequences which encode INTSIG. The specificity of the probe, whether it is made from a highly specific region, e.g., the 5′ regulatory region, or from a less specific region, e.g., a conserved motif, and the stringency of the hybridization or amplification will determine whether the probe identifies only naturally occurring sequences encoding INTSIG, allelic variants, or related sequences.

[0308] Probes may also be used for the detection of related sequences, and may have at least 50% sequence identity to any of the INTSIG encoding sequences. The hybridization probes of the subject invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:24-46 or from genomic sequences including promoters, enhancers, and introns of the INTSIG gene.
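The 50% sequence identity threshold mentioned above can be illustrated with a minimal sketch. It assumes that the probe and target have already been aligned to equal length without gaps; the function name and the example sequences are hypothetical, and a practical comparison would rely on a dedicated alignment tool.

```python
def percent_identity(probe: str, target: str) -> float:
    """Percent of aligned positions at which probe and target match.

    Assumes the two sequences are pre-aligned, gap-free, and of equal length.
    """
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == t for p, t in zip(probe.upper(), target.upper()))
    return 100.0 * matches / len(probe)


# Hypothetical sequences, for illustration only.
identity = percent_identity("ACGTACGT", "ACGTTCGA")
print(f"{identity:.1f}% identity; meets 50% threshold: {identity >= 50.0}")
```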

[0309] Means for producing specific hybridization probes for polynucleotides encoding INTSIG include the cloning of polynucleotides encoding INTSIG or INTSIG derivatives into vectors for the production of mRNA probes. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a variety of reporter groups, for example, by radionuclides such as ³²P or ³⁵S, or by enzymatic labels, such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

[0310] Polynucleotides encoding INTSIG may be used for the diagnosis of disorders associated with expression of INTSIG. Examples of such disorders include, but are not limited to, a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; an autoimmune/inflammatory disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; a neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; a gastrointestinal disorder such as dysphagia, peptic esophagitis, esophageal spasm, esophageal stricture, esophageal carcinoma, dyspepsia, indigestion, gastritis, gastric carcinoma, anorexia, nausea, emesis, gastroparesis, antral or pyloric edema, abdominal angina, pyrosis, gastroenteritis, intestinal obstruction, infections of the intestinal tract, peptic ulcer, cholelithiasis, cholecystitis, cholestasis, pancreatitis, pancreatic carcinoma, biliary tract disease, hepatitis, hyperbilirubinemia, cirrhosis, passive congestion of the liver, hepatoma, infectious colitis, ulcerative colitis, ulcerative proctitis, Crohn's disease, Whipple's disease, Mallory-Weiss syndrome, colonic carcinoma, colonic obstruction, irritable bowel syndrome, short bowel syndrome, diarrhea, constipation, gastrointestinal hemorrhage, acquired immunodeficiency syndrome (AIDS) enteropathy, jaundice, hepatic encephalopathy, hepatorenal syndrome, hepatic steatosis, hemochromatosis, Wilson's disease, alpha₁-antitrypsin deficiency, Reye's syndrome, primary sclerosing cholangitis, liver infarction, portal vein obstruction and thrombosis, centrilobular necrosis, peliosis hepatis, hepatic vein thrombosis, veno-occlusive disease, preeclampsia, eclampsia, acute fatty liver of pregnancy, intrahepatic cholestasis of pregnancy, and hepatic tumors including nodular hyperplasias, adenomas, and carcinomas; a reproductive disorder such as a disorder of prolactin production, infertility, including tubal disease, ovulatory defects, endometriosis, a disruption of the estrous cycle, a disruption of the menstrual cycle, polycystic ovary syndrome, ovarian hyperstimulation syndrome, an endometrial or ovarian tumor, a uterine fibroid, autoimmune disorders, ectopic pregnancy, teratogenesis, cancer of the breast, fibrocystic breast disease, galactorrhea, a disruption of spermatogenesis, abnormal sperm physiology, cancer of the testis, cancer of the prostate, benign prostatic hyperplasia, prostatitis, Peyronie's disease, impotence, carcinoma of the male breast, gynecomastia, hypergonadotropic and hypogonadotropic hypogonadism, pseudohermaphroditism, azoospermia, premature ovarian failure, acrosin deficiency, delayed puberty, retrograde ejaculation and anejaculation, haemangioblastomas, cysts, phaeochromocytomas, paraganglioma, cystadenomas of the epididymis, and endolymphatic sac tumours; a cardiovascular disorder such as arteriovenous fistula, atherosclerosis, hypertension, vasculitis, Raynaud's disease, aneurysms, arterial dissections, varicose veins, thrombophlebitis and phlebothrombosis, vascular tumors, and complications of thrombolysis, balloon angioplasty, vascular replacement, and coronary artery bypass graft surgery, congestive heart failure, ischemic heart disease, angina pectoris, myocardial infarction, hypertensive heart disease, degenerative valvular heart disease, calcific aortic valve stenosis, congenitally bicuspid aortic valve, mitral annular calcification, mitral valve prolapse, rheumatic fever and rheumatic heart disease, infective endocarditis, nonbacterial thrombotic endocarditis, endocarditis of systemic lupus erythematosus, carcinoid heart disease, cardiomyopathy, myocarditis, pericarditis, neoplastic heart disease, congenital heart disease, and complications of cardiac transplantation; a developmental disorder such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation), Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis, hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy, spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural hearing loss; and a vesicle trafficking disorder such as cystic fibrosis, glucose-galactose malabsorption syndrome, hypercholesterolemia, diabetes mellitus, diabetes insipidus, hyper- and hypoglycemia, Graves' disease, goiter, Cushing's disease, and Addison's disease, gastrointestinal disorders including ulcerative colitis, gastric and duodenal ulcers, other conditions associated with abnormal vesicle trafficking, including acquired immunodeficiency syndrome (AIDS), allergies including hay fever, asthma, and urticaria (hives), autoimmune hemolytic anemia, proliferative glomerulonephritis, inflammatory bowel disease, multiple sclerosis, myasthenia gravis, rheumatoid and osteoarthritis, scleroderma, Chediak-Higashi and Sjögren's syndromes, systemic lupus erythematosus, toxic shock syndrome, and traumatic tissue damage. Polynucleotides encoding INTSIG may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like assays; and in microarrays utilizing fluids or tissues from patients to detect altered INTSIG expression. Such qualitative or quantitative methods are well known in the art.

[0311] In a particular aspect, polynucleotides encoding INTSIG may be used in assays that detect the presence of associated disorders, particularly those mentioned above. Polynucleotides complementary to sequences encoding INTSIG may be labeled by standard methods and added to a fluid or tissue sample from a patient under conditions suitable for the formation of hybridization complexes. After a suitable incubation period, the sample is washed and the signal is quantified and compared with a standard value. If the amount of signal in the patient sample is significantly altered in comparison to a control sample, then the altered level of polynucleotides encoding INTSIG in the sample indicates the presence of the associated disorder. Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment of an individual patient.

[0312] In order to provide a basis for the diagnosis of a disorder associated with expression of INTSIG, a normal or standard profile for expression is established. This may be accomplished by combining body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a fragment thereof, encoding INTSIG, under conditions suitable for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained from normal subjects with values from an experiment in which a known amount of a substantially purified polynucleotide is used. Standard values obtained in this manner may be compared with values obtained from samples from patients who are symptomatic for a disorder. Deviation from standard values is used to establish the presence of a disorder.
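The comparison of a patient value against a standard profile can be sketched as follows. This is an illustrative assumption rather than a method stated in the text: the standard is summarized by the mean and sample standard deviation of values from normal subjects, and a patient value is flagged when it lies more than `k` standard deviations from that mean. The function name, the cutoff `k`, and the example values are hypothetical.

```python
from statistics import mean, stdev


def deviates_from_standard(normal_values, patient_value, k=2.0):
    """Return (flagged, z) for a patient value against a normal-subject standard.

    The cutoff k (in standard deviations) is an assumed, illustrative threshold.
    """
    if len(normal_values) < 2:
        raise ValueError("at least two normal values are needed")
    mu = mean(normal_values)
    sd = stdev(normal_values)
    if sd == 0:
        raise ValueError("normal values show no variation")
    z = (patient_value - mu) / sd
    return abs(z) > k, z


# Hypothetical hybridization signals from normal subjects and one patient.
normal_signals = [10.0, 12.0, 11.0, 9.0, 13.0]
flagged, z_score = deviates_from_standard(normal_signals, 20.0)
print(f"z = {z_score:.2f}, deviates from standard: {flagged}")
```

Repeating the same calculation on successive samples from a treated patient would show whether the value moves back toward the normal range.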

[0313] Once the presence of a disorder is established and a treatment protocol is initiated, hybridization assays may be repeated on a regular basis to determine if the level of expression in the patient begins to approximate that which is observed in the normal subject. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.

[0314] With respect to cancer, the presence of an abnormal amount of transcript (either under- or overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development of the disease, or may provide a means for detecting the disease prior to the appearance of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ preventative measures or aggressive treatment earlier, thereby preventing the development or further progression of the cancer.

[0315] Additional diagnostic uses for oligonucleotides designed from the sequences encoding INTSIG may involve the use of PCR. These oligomers may be chemically synthesized, generated enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide encoding INTSIG, or a fragment of a polynucleotide complementary to the polynucleotide encoding INTSIG, and will be employed under optimized conditions for identification of a specific gene or condition. Oligomers may also be employed under less stringent conditions for detection or quantification of closely related DNA or RNA sequences.

[0316] In a particular aspect, oligonucleotide primers derived from polynucleotides encoding INTSIG may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from polynucleotides encoding INTSIG are used to amplify DNA using the polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by comparing the sequence of individual overlapping DNA fragments which assemble into a common consensus sequence. These computer-based methods filter out sequence variations due to laboratory preparation of DNA and sequencing errors using statistical models and automated analyses of DNA sequence chromatograms. In the alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San Diego Calif.).
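The in silico comparison of overlapping fragments against a consensus can be sketched in simplified form. The example assumes the fragments are already aligned to a common coordinate system and padded with '-' where a fragment gives no coverage. In place of the statistical models and chromatogram analysis mentioned above, it uses a plain minimum-support count (`min_support`) to discard variants seen in only one fragment; that threshold, the function name, and the sequences are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_fragments, min_support=2):
    """List (position, consensus_base, alternative_bases) for candidate SNPs.

    A base other than the consensus is reported only if at least
    min_support fragments carry it (an assumed, simplistic error filter).
    """
    length = len(aligned_fragments[0])
    if any(len(f) != length for f in aligned_fragments):
        raise ValueError("fragments must be aligned to equal length")
    snps = []
    for pos in range(length):
        counts = Counter(f[pos] for f in aligned_fragments if f[pos] != "-")
        if not counts:
            continue  # no fragment covers this position
        ranked = counts.most_common()
        consensus = ranked[0][0]
        alternatives = sorted(b for b, n in ranked[1:] if n >= min_support)
        if alternatives:
            snps.append((pos, consensus, alternatives))
    return snps


# Hypothetical aligned fragments, for illustration only.
fragments = ["ACGTAC", "ACGTAC", "ACGTAC", "ACTTAC", "ACTTAC", "ACGTAG"]
print(candidate_snps(fragments))
```

In this example the variant at position 2 is supported by two fragments and is reported, while the single mismatch at position 5 is treated as a probable sequencing error.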

[0317] SNPs may be used to study the genetic basis of human disease. For example, at least 16 common SNPs have been associated with non-insulin-dependent diabetes mellitus. SNPs are also useful for examining differences in disease outcomes in monogenic disorders, such as cystic fibrosis, sickle cell anemia, or chronic granulomatous disease. For example, variants in the mannose-binding lectin, MBL2, have been shown to be correlated with deleterious pulmonary outcomes in cystic fibrosis. SNPs also have utility in pharmacogenomics, the identification of genetic variants that influence a patient's response to a drug, such as life-threatening toxicity. For example, a variation in N-acetyl transferase is associated with a high incidence of peripheral neuropathy in response to the anti-tuberculosis drug isoniazid, while a variation in the core promoter of the ALOX5 gene results in diminished clinical response to treatment with an anti-asthma drug that targets the 5-lipoxygenase pathway. Analysis of the distribution of SNPs in different populations is useful for investigating genetic drift, mutation, recombination, and selection, as well as for tracing the origins of populations and their migrations (Taylor, J. G. et al. (2001) Trends Mol. Med. 7:507-512; Kwok, P.-Y. and Z. Gu (1999) Mol. Med. Today 5:538-543; Nowotny, P. et al. (2001) Curr. Opin. Neurobiol. 11:637-641).

[0318] Methods which may also be used to quantify the expression of INTSIG include radiolabeling or biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from standard curves (Melby, P. C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al. (1993) Anal. Biochem. 212:229-236). The speed of quantitation of multiple samples may be accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid quantitation.
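Interpolating a result from a standard curve can be sketched as piecewise-linear interpolation between the two standards that bracket the measured signal. The sketch assumes that signal increases monotonically with amount; the function name, units, and values are hypothetical.

```python
def interpolate_from_standard_curve(standards, signal):
    """Estimate an amount from a measured signal.

    standards: (known_amount, measured_signal) pairs from a dilution series.
    Assumes signal rises monotonically with amount.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return a0
            fraction = (signal - s0) / (s1 - s0)
            return a0 + fraction * (a1 - a0)
    raise ValueError("signal lies outside the range of the standard curve")


# Hypothetical standards: (amount in ng, absorbance).
curve = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
estimate = interpolate_from_standard_curve(curve, 0.35)
print(f"estimated amount: {estimate:.2f} ng")
```

Signals outside the range of the standards raise an error rather than being extrapolated, since the curve is not characterized there.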

[0319] In further embodiments, oligonucleotides or longer fragments derived from any of the polynucleotides described herein may be used as elements on a microarray. The microarray can be used in transcript imaging techniques which monitor the relative expression levels of large numbers of genes simultaneously as described below. The microarray may also be used to identify genetic variants, mutations, and polymorphisms. This information may be used to determine gene function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor progression/regression of disease as a function of gene expression, and to develop and monitor the activities of therapeutic agents in the treatment of disease. In particular, this information may be used to develop a pharmacogenomic profile of a patient in order to select the most appropriate and effective treatment regimen for that patient. For example, therapeutic agents which are highly effective and display the fewest side effects may be selected for a patient based on his/her pharmacogenomic profile.

[0320] In another embodiment, INTSIG, fragments of INTSIG, or antibodies specific for INTSIG may be used as elements on a microarray. The microarray may be used to monitor or measure protein-protein interactions, drug-target interactions, and gene expression profiles, as described above.

[0321] A particular embodiment relates to the use of the polynucleotides of the present invention to generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by quantifying the number of expressed genes and their relative abundance under given conditions and at a given time (Seilhamer et al., “Comparative Gene Transcript Analysis,” U.S. Pat. No. 5,840,484; hereby expressly incorporated by reference herein). Thus a transcript image may be generated by hybridizing the polynucleotides of the present invention or their complements to the totality of transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the hybridization takes place in high-throughput format, wherein the polynucleotides of the present invention or their complements comprise a subset of a plurality of elements on a microarray. The resultant transcript image would provide a profile of gene activity.

[0322] Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

[0323] Transcript images which profile the expression of the polynucleotides of the present invention may also be used in conjunction with in vitro model systems and preclinical evaluation of pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental compounds. All compounds induce characteristic gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N. L. Anderson (2000) Toxicol. Lett. 112-113:467-471). If a test compound has a signature similar to that of a compound with known toxicity, it is likely to share those toxic properties. These fingerprints or signatures are most useful and refined when they contain expression information from a large number of genes and gene families. Ideally, a genome-wide measurement of expression provides the highest quality signature. Genes whose expression is not altered by any tested compound are also important, as the levels of expression of these genes are used to normalize the rest of the expression data. The normalization procedure is useful for comparison of expression data after treatment with different compounds. While the assignment of gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical matching of signatures which leads to prediction of toxicity (see, for example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released Feb. 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm). Therefore, it is important and desirable in toxicological screening using toxicant signatures to include all expressed gene sequences.
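The normalization and signature-matching steps described above can be sketched as follows. The text names no particular statistic, so this example makes several assumptions: expression values are scaled by the mean of the unaltered (reference) genes, a signature is taken as the log2 ratio of treated to untreated levels, and signatures are compared by Pearson correlation. All names and values are hypothetical, and all expression levels are assumed to be positive.

```python
import math


def normalize(expr, reference_genes):
    """Scale expression levels by the mean level of the reference genes."""
    scale = sum(expr[g] for g in reference_genes) / len(reference_genes)
    if scale <= 0:
        raise ValueError("reference genes must have positive expression")
    return {gene: level / scale for gene, level in expr.items()}


def toxicant_signature(treated, untreated, reference_genes):
    """Log2 treated/untreated ratio per non-reference gene, after normalization."""
    t = normalize(treated, reference_genes)
    u = normalize(untreated, reference_genes)
    return {g: math.log2(t[g] / u[g])
            for g in t if g in u and g not in reference_genes}


def signature_similarity(sig_a, sig_b):
    """Pearson correlation over the genes shared by two signatures."""
    genes = sorted(set(sig_a) & set(sig_b))
    xs = [sig_a[g] for g in genes]
    ys = [sig_b[g] for g in genes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a signature with no variation cannot be correlated")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression levels; ref1 and ref2 are unaltered reference genes.
refs = ["ref1", "ref2"]
untreated = {"ref1": 100.0, "ref2": 200.0, "g1": 50.0, "g2": 80.0, "g3": 20.0}
known_toxicant = {"ref1": 200.0, "ref2": 400.0, "g1": 400.0, "g2": 40.0, "g3": 40.0}
test_compound = {"ref1": 100.0, "ref2": 200.0, "g1": 100.0, "g2": 40.0, "g3": 20.0}

known_sig = toxicant_signature(known_toxicant, untreated, refs)
test_sig = toxicant_signature(test_compound, untreated, refs)
print(f"similarity = {signature_similarity(known_sig, test_sig):.3f}")
```

Normalizing by the reference genes removes the overall twofold difference in signal between the two treated samples, so only the gene-specific changes contribute to the comparison.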

[0324] In an embodiment, the toxicity of a test compound can be assessed by treating a biological sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated biological sample are hybridized with one or more probes specific to the polynucleotides of the present invention, so that transcript levels corresponding to the polynucleotides of the present invention may be quantified. The transcript levels in the treated biological sample are compared with levels in an untreated biological sample. Differences in the transcript levels between the two samples are indicative of a toxic response caused by the test compound in the treated sample.

[0325] Another embodiment relates to the use of the polypeptides disclosed herein to analyze the proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles, are analyzed by quantifying the number of expressed proteins and their relative abundance under given conditions and at a given time. A profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot is generally proportional to the level of the protein in the sample. The optical densities of equivalently positioned protein spots from different samples, for example, from biological samples either treated or untreated with a test compound or therapeutic agent, are compared to identify any changes in protein spot density related to the treatment. The proteins in the spots are partially sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry. The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of interest. In some cases, further sequence data may be obtained for definitive protein identification.
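The identification step at the end of the paragraph above, in which a partial sequence of at least 5 contiguous residues is compared to the polypeptide sequences of interest, can be sketched as an exact substring search. The function name, the polypeptide names, and the sequences are hypothetical; practical protein identification would typically tolerate mismatches and use database search tools.

```python
def identify_spot(partial_seq, polypeptides, min_len=5):
    """Return names of polypeptides that contain the partial sequence exactly.

    partial_seq must have at least min_len contiguous residues.
    polypeptides maps a name to its amino acid sequence (one-letter code).
    """
    query = partial_seq.upper()
    if len(query) < min_len:
        raise ValueError(f"partial sequence must have at least {min_len} residues")
    return [name for name, seq in polypeptides.items() if query in seq.upper()]


# Hypothetical polypeptide sequences, for illustration only.
candidates = {
    "hypothetical_A": "MKTAYIAKQRQISFVK",
    "hypothetical_B": "MSDNELLQKAGRT",
}
print(identify_spot("AKQRQ", candidates))
```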

[0326] A proteomic profile may also be generated using antibodies specific for INTSIG to quantify the levels of INTSIG expression. In one embodiment, the antibodies are used as elements on a microarray, and protein expression levels are quantified by exposing the microarray to the sample and detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111; Mendoza, L. G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.

[0327] Toxicant signatures at the proteome level are also useful for toxicological screening, and should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor correlation between transcript and protein abundances for some proteins in some tissues (Anderson, N. L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the analysis of compounds which do not significantly affect the transcript image, but which alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.

[0328] In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins that are expressed in the treated biological sample are separated so that the amount of each protein can be quantified. The amount of each protein is compared to the amount of the corresponding protein in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the individual proteins and comparing these partial sequences to the polypeptides of the present invention.

[0329] In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins from the biological sample are incubated with antibodies specific to the polypeptides of the present invention. The amount of protein recognized by the antibodies is quantified. The amount of protein in the treated biological sample is compared with the amount in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample.

[0330] Microarrays may be prepared, used, and analyzed using methods known in the art (Brennan, T. M. et al (1995) U.S. Pat. No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al. (1995) PCT application WO95/35505; Heller, R. A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155;

[0331] Heller, M. J. et al. (1997) U.S. Pat. No. 5,605,662). Various types of microarrays are well known and thoroughly described in Schena, M., ed. (1999; DNA Microarrays: A Practical Approach, Oxford University Press, London).

[0332] In another embodiment of the invention, nucleic acid sequences encoding INTSIG may be used to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either coding or noncoding sequences may be used, and in some instances, noncoding sequences may be preferable over coding sequences. For example, conservation of a coding sequence among members of a multi-gene family may potentially cause undesired cross hybridization during chromosomal mapping. The sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries (Harrington, J. J. et al. (1997) Nat Genet 15:345-355; Price, C. M. (1993) Blood Rev. 7:127-134; Trask, B. J. (1991) Trends Genet 7:149-154). Once mapped, the nucleic acid sequences may be used to develop genetic linkage maps, for example, which correlate the inheritance of a disease state with the inheritance of a particular chromosome region or restriction fragment length polymorphism (RFLP) (Lander, E. S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357).

[0333] Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic map data (Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968). Examples of genetic map data can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) World Wide Web site. Correlation between the location of the gene encoding INTSIG on a physical map and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA associated with that disorder and thus may further positional cloning efforts.

[0334] In situ hybridization of chromosomal preparations and physical mapping techniques, such as linkage analysis using established chromosomal markers, may be used for extending genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the exact chromosomal locus is not known. This information is valuable to investigators searching for disease genes using positional cloning or other gene discovery techniques. Once the gene or genes responsible for a disease or syndrome have been crudely localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any sequences mapping to that area may represent associated or regulatory genes for further investigation (Gatti, R. A. et al. (1988) Nature 336:577-580). The nucleotide sequence of the instant invention may also be used to detect differences in the chromosomal location due to translocation, inversion, etc., among normal, carrier, or affected individuals.

[0335] In another embodiment of the invention, INTSIG, its catalytic or immunogenic fragments, or oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug screening techniques. The fragment employed in such screening may be free in solution, affixed to a solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes between INTSIG and the agent being tested may be measured.

[0336] Another technique for drug screening provides for high throughput screening of compounds having suitable binding affinity to the protein of interest (Geysen, et al. (1984) PCT application WO84/03564). In this method, large numbers of different small test compounds are synthesized on a solid substrate. The test compounds are reacted with INTSIG, or fragments thereof, and washed. Bound INTSIG is then detected by methods well known in the art. Purified INTSIG can also be coated directly onto plates for use in the aforementioned drug screening techniques. Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

[0337] In another embodiment, one may use competitive drug screening assays in which neutralizing antibodies capable of binding INTSIG specifically compete with a test compound for binding INTSIG. In this manner, antibodies can be used to detect the presence of any peptide which shares one or more antigenic determinants with INTSIG.

[0338] In additional embodiments, the nucleotide sequences which encode INTSIG may be used in any molecular biology techniques that have yet to be developed, provided the new techniques rely on properties of nucleotide sequences that are currently known, including, but not limited to, such properties as the triplet genetic code and specific base pair interactions.

[0339] Without further elaboration, it is believed that one skilled in the art can, using the preceding description, utilize the present invention to its fullest extent. The following embodiments are, therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever.

[0340] The disclosures of all patents, applications, and publications mentioned above and below, including U.S. Ser. No. 60/305,113, U.S. Ser. No. 60/305,367, U.S. Ser. No. 60/306,966, U.S. Ser. No. 60/308,175, U.S. Ser. No. 60/308,327, U.S. Ser. No. 60/309,902, U.S. Ser. No. 60/310,752, and U.S. Ser. No. 60/311,636, are expressly incorporated by reference herein.

EXAMPLES

[0341] I. Construction of cDNA Libraries

[0342] Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto Calif.). Some tissues were homogenized and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL (Invitrogen), a monophasic solution of phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and ethanol, or by other routine methods.

[0343] Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN, Chatsworth Calif.), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Austin Tex.).

[0344] In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP vector system (Stratagene) or SUPERSCRIPT plasmid system (Invitrogen), using the recommended procedures or similar methods known in the art (Ausubel et al., supra, ch 5). Reverse transcription was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Biosciences) or preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Invitrogen), PCDNA2.1 plasmid (Invitrogen, Carlsbad Calif.), PBK-CMV plasmid (Stratagene), PCR2-TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte Genomics, Palo Alto Calif.), pRARE (Incyte Genomics), or pINCY (Incyte Genomics), or derivatives thereof. Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Invitrogen.

[0345] II. Isolation of cDNA Clones

[0346] Plasmids obtained as described in Example I were recovered from host cells by in vivo excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg Md.); and QIAWELL 8 Plasmid, QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or without lyophilization, at 4° C.

[0347] Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a high-throughput format (Rao, V. B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using PICOGREEN dye (Molecular Probes, Eugene Oreg.) and a FLUOROSKAN II fluorescence scanner (Labsystems Oy, Helsinki, Finland).

[0348] III. Sequencing and Analysis

[0349] Incyte cDNAs recovered in plasmids as described in Example II were sequenced as follows. Sequencing reactions were processed using standard methods or high-throughput instrumentation such as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared using reagents provided by Amersham Biosciences or supplied in ABI sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system (Amersham Biosciences); the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI protocols and base calling software; or other sequence analysis systems known in the art. Reading frames within the cDNA sequences were identified using standard methods (Ausubel et al., supra, ch. 7). Some of the cDNA sequences were selected for extension using the techniques disclosed in Example VIII.

[0350] The polynucleotide sequences derived from Incyte cDNAs were validated by removing vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The Incyte cDNA sequences or translations thereof were then queried against a selection of public databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and BLOCKS, PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo sapiens, Rattus norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Candida albicans (Incyte Genomics, Palo Alto Calif.); hidden Markov model (HMM)-based protein family databases such as PFAM, INCY, and TIGRFAM (Haft, D. H. et al. (2001) Nucleic Acids Res. 29:41-43); and HMM-based protein domain databases such as SMART (Schultz, J. et al. (1998) Proc. Natl. Acad. Sci. USA 95:5857-5864; Letunic, I. et al. (2002) Nucleic Acids Res. 30:242-244). (HMM is a probabilistic approach which analyzes consensus primary structures of gene families; see, for example, Eddy, S. R. (1996) Curr. Opin. Struct. Biol. 6:361-365.) The queries were performed using programs based on BLAST, FASTA, BLIMPS, and HMMER. The Incyte cDNA sequences were assembled to produce full length polynucleotide sequences. Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or Genscan-predicted coding sequences (see Examples IV and V) were used to extend Incyte cDNA assemblages to full length. Assembly was performed using programs based on Phred, Phrap, and Consed, and cDNA assemblages were screened for open reading frames using programs based on GeneMark, BLAST, and FASTA. The full length polynucleotide sequences were translated to derive the corresponding full length polypeptide sequences. Alternatively, a polypeptide may begin at any of the methionine residues of the full length translated polypeptide. Full length polypeptide sequences were subsequently analyzed by querying against databases such as the GenBank protein databases (genpept), SwissProt, the PROTEOME databases, BLOCKS, PRINTS, DOMO, PRODOM, Prosite, hidden Markov model (HMM)-based protein family databases such as PFAM, INCY, and TIGRFAM; and HMM-based protein domain databases such as SMART. Full length polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco Calif.) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence alignments are generated using default parameters specified by the CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent identity between aligned sequences.

[0351] Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of Incyte cDNA and full length sequences and provides applicable descriptions, references, and threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the second column provides brief descriptions thereof, the third column presents appropriate references, all of which are incorporated by reference herein in their entirety, and the fourth column presents, where applicable, the scores, probability values, and other parameters used to evaluate the strength of a match between two sequences (the higher the score or the lower the probability value, the greater the identity between two sequences).

[0352] The programs described above for the assembly and analysis of full length polynucleotide and polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ ID NO:24-46. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and amplification technologies are described in Table 4, column 2.

[0353] IV. Identification and Editing of Coding Sequences from Genomic DNA

Putative intracellular signaling molecules were initially identified by running the Genscan gene identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is a general-purpose gene identification program which analyzes genomic DNA sequences from a variety of organisms (Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94; Burge, C. and S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to form an assembled cDNA sequence extending from a methionine to a stop codon. The output of Genscan is a FASTA database of polynucleotide and polypeptide sequences. The maximum range of sequence for Genscan to analyze at once was set to 30 kb. To determine which of these Genscan-predicted cDNA sequences encode intracellular signaling molecules, the encoded polypeptides were analyzed by querying against PFAM models for intracellular signaling molecules. Potential intracellular signaling molecules were also identified by homology to Incyte cDNA sequences that had been annotated as intracellular signaling molecules. These selected Genscan-predicted sequences were then compared by BLAST analysis to the genpept and gbpri public databases. Where necessary, the Genscan-predicted sequences were then edited by comparison to the top BLAST hit from genpept to correct errors in the sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis was also used to find any Incyte cDNA or public cDNA coverage of the Genscan-predicted sequences, thus providing evidence for transcription. When Incyte cDNA coverage was available, this information was used to correct or confirm the Genscan-predicted sequence. Full length polynucleotide sequences were obtained by assembling Genscan-predicted coding sequences with Incyte cDNA sequences and/or public cDNA sequences using the assembly process described in Example III. Alternatively, full length polynucleotide sequences were derived entirely from edited or unedited Genscan-predicted coding sequences.

[0354] V. Assembly of Genomic Sequence Data with cDNA Sequence Data

“Stitched” Sequences

[0355] Partial cDNA sequences were extended with exons predicted by the Genscan gene identification program described in Example IV. Partial cDNAs assembled as described in Example III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm based on graph theory and dynamic programming to integrate cDNA and genomic information, generating possible splice variants that were subsequently confirmed, edited, or extended to create a full length sequence. Sequence intervals in which the entire length of the interval was present on more than one sequence in the cluster were identified, and intervals thus identified were considered to be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic sequences, then all three intervals were considered to be equivalent. This process allows unrelated but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals thus identified were then “stitched” together by the stitching algorithm in the order that they appear along their parent sequences to generate the longest possible sequence, as well as sequence variants. Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or genomic sequence to genomic sequence) were given preference over linkages which change parent type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended with additional cDNA sequences, or by inspection of genomic DNA, when necessary.
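The "equivalent by transitivity" step described above can be illustrated with a small union-find sketch. This is only an illustration of the grouping idea, not the stitching algorithm itself (which the text describes as based on graph theory and dynamic programming). The interval labels and the function name `group_equivalent` are hypothetical, and the pairwise matches are assumed to have been found already.

```python
def group_equivalent(pairs):
    """Group interval labels into equivalence classes.

    `pairs` is an iterable of (a, b) tuples meaning "interval a and
    interval b cover the same sequence". Equivalence is extended by
    transitivity using a simple union-find structure.
    """
    parent = {}

    def find(x):
        # Register unseen labels as their own root, then walk to the root
        # with path halving to keep later lookups short.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for label in list(parent):
        groups.setdefault(find(label), set()).add(label)
    # Sorted output so results are deterministic.
    return sorted(sorted(group) for group in groups.values())


# Hypothetical labels: one cDNA interval matched on two genomic sequences.
matches = [("cDNA:1", "gA:1"), ("gA:1", "gB:1"), ("cDNA:2", "gB:2")]
print(group_equivalent(matches))
```

Here the cDNA interval and the two genomic intervals end up in one group even though `cDNA:1` and `gB:1` were never compared directly, which mirrors the three-interval example in the text.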

[0356] “Stretched” Sequences

[0357] Partial DNA sequences were extended to full length with an algorithm based on BLAST analysis. First, partial cDNAs assembled as described in Example III were queried against public databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST analysis to either Incyte cDNA sequences or Genscan exon predicted sequences described in Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs (HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions may occur in the chimeric protein with respect to the original GenBank protein homolog. The GenBank protein homolog, the chimeric protein, or both were used as probes to search for homologous genomic sequences from the public human genome databases. Partial DNA sequences were therefore “stretched” or extended by the addition of homologous genomic sequences. The resultant stretched sequences were examined to determine whether they contained a complete gene.

[0358] VI. Chromosomal Mapping of INTSIG Encoding Polynucleotides

[0359] The sequences which were used to assemble SEQ ID NO:24-46 were compared with sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other implementations of the Smith-Waterman algorithm. Sequences from these databases that matched SEQ ID NO:24-46 were assembled into clusters of contiguous and overlapping sequences using assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.

[0360] Map locations are represented by ranges, or intervals, of human chromosomes. The map position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances are based on genetic markers mapped by Généthon which provide boundaries for radiation hybrid markers whose sequences were included in each of the clusters. Human genome maps and other resources available to the public, such as the NCBI “GeneMap′99” World Wide Web site (http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified disease genes map within or in proximity to the intervals indicated above.

[0361] VII. Analysis of Polynucleotide Expression

[0362] Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a particular cell type or tissue have been bound (Sambrook, supra, ch 7; Ausubel et al., supra, ch. 4).

[0363] Analogous computer techniques applying BLAST were used to search for identical or related molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or similar. The basis of the search is the product score, which is defined as: $\frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}$

[0364] The product score takes into account both the degree of similarity between two sequences and the length of the sequence match. The product score is a normalized value between 0 and 100, and is calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and −4 for every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate the product score. The product score represents a balance between fractional overlap and quality in a BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the entire length of the shorter of the two sequences being compared. A product score of 70 is produced either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap.
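The product score calculation can be sketched as follows for a single ungapped HSP. The sketch assumes a match score of +5 and a mismatch score of −4, which is the scoring that reproduces the worked examples in the text. The function names are illustrative only.

```python
def blast_score(matches: int, mismatches: int) -> int:
    """Score one ungapped HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(matches: int, mismatches: int,
                  len_seq1: int, len_seq2: int) -> float:
    """BLAST score x percent identity / (5 x length of the shorter sequence)."""
    hsp_length = matches + mismatches
    if hsp_length == 0:
        return 0.0
    percent_identity = 100.0 * matches / hsp_length
    shorter = min(len_seq1, len_seq2)
    return blast_score(matches, mismatches) * percent_identity / (5 * shorter)


# 100% identity over the whole of the shorter sequence
print(product_score(100, 0, 100, 200))
# 100% identity over 70% of the shorter sequence
print(product_score(70, 0, 100, 100))
# 88% identity over 100% of the shorter sequence
print(product_score(88, 12, 100, 100))
```

The first two calls give exactly 100 and 70. The third gives approximately 69, in line with the rounded figure of 70 quoted for 88% identity and full overlap.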

[0365] Alternatively, polynucleotides encoding INTSIG are analyzed with respect to the tissue sources from which they were derived. For example, some full length sequences are assembled, at least in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the following organ/tissue categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. The number of libraries in each category is counted and divided by the total number of libraries across all categories. Similarly, each human tissue is classified into one of the following disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma, cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided by the total number of libraries across all categories. The resulting percentages reflect the tissue- and disease-specific expression of cDNA encoding INTSIG. cDNA sequences and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto Calif.).
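The per-category percentages described above amount to a count divided by a total. A minimal sketch, assuming each library has already been assigned exactly one category label (the input list below is made-up illustration data, not data from the LIFESEQ GOLD database):

```python
from collections import Counter


def category_percentages(library_categories):
    """Return {category: percent of libraries} for a list of category labels.

    Each element of `library_categories` is the category assigned to one
    cDNA library in which the sequence was found.
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical example: four libraries across three tissue categories.
libraries = ["liver", "liver", "nervous system", "skin"]
print(category_percentages(libraries))
```

The same function applies unchanged to the disease/condition categories, since only the labels differ.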

[0366] VIII. Extension of INTSIG Encoding Polynucleotides

[0367] Full length polynucleotides are produced by extension of an appropriate fragment of the full length molecule using oligonucleotide primers designed from this fragment. One primer was synthesized to initiate 5′ extension of the known fragment, and the other primer was synthesized to initiate 3′ extension of the known fragment. The initial primers were designed using OLIGO 4.06 software (National Biosciences), or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target sequence at temperatures of about 68° C. to about 72° C. Any stretch of nucleotides which would result in hairpin structures and primer-primer dimerizations was avoided.
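Two of the primer design criteria above, length and GC content, are simple to check programmatically. The sketch below is not a substitute for OLIGO 4.06 or any primer design program: it does not estimate annealing temperature and does not screen for hairpins or primer-primer dimers. The function names, the thresholds as hard cutoffs, and the example sequences are assumptions for illustration.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_length_and_gc(primer: str, min_len: int = 22, max_len: int = 30,
                        min_gc: float = 0.50) -> bool:
    """Check only the length (22 to 30 nt) and GC content (>= 50%) criteria."""
    if not primer:
        return False
    return min_len <= len(primer) <= max_len and gc_fraction(primer) >= min_gc


# Hypothetical candidate primers
print(meets_length_and_gc("GCATGCATGCATGCATGCATGC"))  # 22 nt, 12 of 22 bases G/C
print(meets_length_and_gc("ATATATATATATATATATATAT"))  # 22 nt, no G/C
```

The text states the criteria as approximate ("about 22 to 30", "about 50% or more"), so in practice the cutoffs would be treated as soft rather than strict.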

[0368] Selected human cDNA libraries were used to extend the sequence. If more than one extension was necessary or desired, additional or nested sets of primers were designed.

[0369] High fidelity amplification was obtained by PCR using methods well known in the art. PCR was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg²⁺, (NH₄)₂SO₄, and 2-mercaptoethanol, Taq DNA polymerase (Amersham Biosciences), ELONGASE enzyme (Invitrogen), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair PCI A and PCI B: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 60° C., 1 min; Step 4: 68° C., 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C., 5 min; Step 7: storage at 4° C. In the alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 57° C., 1 min; Step 4: 68° C., 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C., 5 min; Step 7: storage at 4° C.

[0370] The concentration of DNA in each well was determined by dispensing 100 μl PICOGREEN quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene Oreg.) dissolved in 1× TE and 0.5 μl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar, Acton Mass.), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II (Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the concentration of DNA. A 5 μl to 10 μl aliquot of the reaction mixture was analyzed by electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the sequence.

[0371] The extended nucleotides were desalted and concentrated, transferred to 384-well plates, digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison Wis.), and sonicated or sheared prior to religation into pUC 18 vector (Amersham Biosciences). For shotgun sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%) agarose gels, fragments were excised, and agar digested with Agar ACE (Promega). Extended clones were religated using T4 ligase (New England Biolabs, Beverly Mass.) into pUC 18 vector (Amersham Biosciences), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and transfected into competent E. coli cells. Transformed cells were selected on antibiotic-containing media, and individual colonies were picked and cultured overnight at 37° C. in 384-well plates in LB/2×carb liquid media.

[0372] The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase (Amersham Biosciences) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 60° C., 1 min; Step 4: 72° C., 2 min; Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72° C., 5 min; Step 7: storage at 4° C. DNA was quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries were reamplified using the same conditions as described above. Samples were diluted with 20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit (Amersham Biosciences) or the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).

[0373] In like manner, full length polynucleotides are verified using the above procedure or are used to obtain 5′ regulatory sequences using the above procedure along with oligonucleotides designed for such extension, and an appropriate genomic library.

[0374] IX. Identification of Single Nucleotide Polymorphisms in INTSIG Encoding Polynucleotides

[0375] Common DNA sequence variants known as single nucleotide polymorphisms (SNPs) were identified in SEQ ID NO:24-46 using the LIFESEQ database (Incyte Genomics). Sequences from the same gene were clustered together and assembled as described in Example III, allowing the identification of all sequence variants in the gene. An algorithm consisting of a series of filters was used to distinguish SNPs from other sequence variants. Preliminary filters removed the majority of basecall errors by requiring a minimum Phred quality score of 15, and removed sequence alignment errors and errors resulting from improper trimming of vector sequences, chimeras, and splice variants. An automated procedure of advanced chromosome analysis analyzed the original chromatogram files in the vicinity of the putative SNP. Clone error filters used statistically generated algorithms to identify errors introduced during laboratory processing, such as those caused by reverse transcriptase, polymerase, or somatic mutation. Clustering error filters used statistically generated algorithms to identify errors resulting from clustering of close homologs or pseudogenes, or due to contamination by non-human sequences. A final set of filters removed duplicates and SNPs found in immunoglobulins or T-cell receptors.
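To put the minimum Phred quality score of 15 in context: a Phred score Q corresponds to an estimated basecall error probability of 10^(−Q/10), so Q = 15 is roughly a 3% chance that the base is wrong. The sketch below shows that conversion and a basic quality filter. It covers only the basecall-quality step, not the alignment, clone error, or clustering error filters, and the function names and input format are assumptions.

```python
def phred_error_probability(quality: float) -> float:
    """Estimated probability that a basecall with this Phred score is wrong."""
    return 10 ** (-quality / 10)


def filter_basecalls(calls, min_quality: int = 15):
    """Keep (base, quality) pairs whose Phred score is at least min_quality."""
    return [(base, q) for base, q in calls if q >= min_quality]


# Hypothetical basecalls at a putative SNP position
calls = [("A", 30), ("C", 14), ("G", 15)]
print(filter_basecalls(calls))
print(phred_error_probability(15))
```

With these inputs the call with quality 14 is dropped, while the call with quality exactly 15 is kept, since the threshold is treated as inclusive here.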

[0376] Certain SNPs were selected for further characterization by mass spectrometry using the high throughput MASSARRAY system (Sequenom, Inc.) to analyze allele frequencies at the SNP sites in four different human populations. The Caucasian population comprised 92 individuals (46 male, 46 female), including 83 from Utah, four French, three Venezuelan, and two Amish individuals. The African population comprised 194 individuals (97 male, 97 female), all African Americans. The Hispanic population comprised 324 individuals (162 male, 162 female), all Mexican Hispanic. The Asian population comprised 126 individuals (64 male, 62 female) with a reported parental breakdown of 43% Chinese, 31% Japanese, 13% Korean, 5% Vietnamese, and 8% other Asian. Allele frequencies were first analyzed in the Caucasian population; in some cases those SNPs which showed no allelic variance in this population were not further tested in the other three populations.
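By way of illustration, an allele frequency at a single SNP site may be computed from genotype counts as sketched below. The genotype counts are hypothetical values chosen to sum to a 92-individual panel; they are not data from the populations described above, and the function name is an assumption made for this example.

```python
def allele_frequencies(genotype_counts: dict[str, int]) -> dict[str, float]:
    """Count alleles across diploid genotypes such as 'AA', 'AG', 'GG'."""
    allele_counts: dict[str, int] = {}
    for genotype, n in genotype_counts.items():
        for allele in genotype:  # each individual contributes two alleles
            allele_counts[allele] = allele_counts.get(allele, 0) + n
    total = sum(allele_counts.values())
    return {allele: count / total for allele, count in allele_counts.items()}


# Hypothetical genotype counts for one SNP in a 92-individual panel.
example = {"AA": 50, "AG": 30, "GG": 12}
freqs = allele_frequencies(example)
```

A SNP for which one allele has a frequency of 1.0 in the first population tested shows no allelic variance in that population.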

[0377] X. Labeling and Use of Individual Hybridization Probes

[0378] Hybridization probes derived from SEQ ID NO:24-46 are employed to screen cDNAs, genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base pairs, is specifically described, essentially the same procedure is used with larger nucleotide fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06 software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 μCi of [γ³²P] adenosine triphosphate (Amersham Biosciences), and T4 polynucleotide kinase (DuPont NEN, Boston Mass.). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25 superfine size exclusion dextran bead column (Amersham Biosciences). An aliquot containing 10⁷ counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I, Xba I, or Pvu II (DuPont NEN).

[0379] The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon membranes (Nytran Plus, Schleicher & Schuell, Durham N.H.). Hybridization is carried out for 16 hours at 40° C. To remove nonspecific signals, blots are sequentially washed at room temperature under conditions of up to, for example, 0.1× saline sodium citrate and 0.5% sodium dodecyl sulfate. Hybridization patterns are visualized using autoradiography or an alternative imaging means and compared.

[0380] XI. Microarrays

[0381] The linkage or synthesis of array elements upon a microarray can be achieved utilizing photolithography, piezoelectric printing (inkjet printing; see, e.g., Baldeschweiler et al., supra), mechanical microspotting technologies, and derivatives thereof. The substrate in each of the aforementioned technologies should be uniform and solid with a non-porous surface (Schena, M., ed. (1999) DNA Microarrays: A Practical Approach, Oxford University Press, London). Suggested substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a procedure analogous to a dot or slot blot may also be used to arrange and link elements to the surface of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may be produced using available methods and machines well known to those of ordinary skill in the art and may contain any appropriate number of elements (Schena, M. et al. (1995) Science 270:467-470; Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall A. and J. Hodgson (1998) Nat. Biotechnol. 16:27-31).

[0382] Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be selected using software well known in the art such as LASERGENE software (DNASTAR). The array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection. After hybridization, nonhybridized nucleotides from the biological sample are removed, and a fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser desorption and mass spectrometry may be used for detection of hybridization. The degree of complementarity and the relative abundance of each polynucleotide which hybridizes to an element on the microarray may be assessed. In one embodiment, microarray preparation and usage is described in detail below.

[0383] Tissue or Cell Sample Preparation

[0384] Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and poly(A)⁺ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)⁺ RNA sample is reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/μl oligo-(dT) primer (21mer), 1× first strand buffer, 0.03 units/μl RNase inhibitor, 500 μM dATP, 500 μM dGTP, 500 μM dTTP, 40 μM dCTP, 40 μM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Biosciences). The reverse transcription reaction is performed in a 25 μl volume containing 200 ng poly(A)⁺ RNA with GEMBRIGHT kits (Incyte). Specific control poly(A)⁺ RNAs are synthesized by in vitro transcription from non-coding yeast genomic DNA. After incubation at 37° C. for 2 hr, each reaction sample (one with Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and incubated for 20 minutes at 85° C. to stop the reaction and degrade the RNA. Samples are purified using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. (CLONTECH), Palo Alto Calif.) and after combining, both reaction samples are ethanol precipitated using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The sample is then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook N.Y.) and resuspended in 14 μl 5×SSC/0.2% SDS.

[0385] Microarray Preparation

[0386] Sequences of the present invention are used to generate array elements. Each array element is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 μg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Biosciences).

[0387] Purified array elements are immobilized on polymer-coated glass slides. Glass microscope slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR Scientific Products Corporation (VWR), West Chester Pa.), washed extensively in distilled water, and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110° C. oven.

[0388] Array elements are applied to the coated glass substrate using a procedure described in U.S. Pat. No. 5,807,522, incorporated herein by reference. 1 μl of the array element DNA, at an average concentration of 100 ng/μl, is loaded into the open capillary printing element by a high-speed robotic apparatus. The apparatus then deposits about 5 nl of array element sample per slide.

[0389] Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate buffered saline (PBS) (Tropix, Inc., Bedford Mass.) for 30 minutes at 60° C. followed by washes in 0.2% SDS and distilled water as before.

[0390] Hybridization

[0391] Hybridization reactions contain 9 μl of sample mixture consisting of 0.2 μg each of Cy3 and Cy5 labeled cDNA synthesis products in 5×SSC, 0.2% SDS hybridization buffer. The sample mixture is heated to 65° C. for 5 minutes and is aliquoted onto the microarray surface and covered with a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140 μl of 5×SSC in a corner of the chamber. The chamber containing the arrays is incubated for about 6.5 hours at 60° C. The arrays are washed for 10 min at 45° C. in a first wash buffer (1×SSC, 0.1% SDS), three times for 10 minutes each at 45° C. in a second wash buffer (0.1×SSC), and dried.

[0392] Detection

[0393] Reporter-labeled hybridization complexes are detected with a microscope equipped with an Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara Calif.) capable of generating spectral lines at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is focused on the array using a 20× microscope objective (Nikon, Inc., Melville N.Y.). The slide containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-scanned past the objective. The 1.8 cm×1.8 cm array used in the present example is scanned with a resolution of 20 micrometers.
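The scan geometry recited above implies a definite image size, which may be estimated as sketched below. This is an illustrative calculation only; it assumes that the stated 20 micrometer resolution is the pixel pitch in both the X and Y directions, and the names used are hypothetical.

```python
ARRAY_SIDE_CM = 1.8   # array edge length, per the text
PIXEL_PITCH_UM = 20   # scan resolution, read here as the pixel pitch


def pixels_per_side(side_cm: float = ARRAY_SIDE_CM,
                    pitch_um: float = PIXEL_PITCH_UM) -> int:
    """Number of scan positions along one edge (1 cm = 10,000 micrometers)."""
    return round(side_cm * 10_000 / pitch_um)


side = pixels_per_side()
total_pixels = side * side   # one intensity value per pixel per fluorophore scan
```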

[0394] In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, Hamamatsu Photonics Systems, Bridgewater N.J.) corresponding to the two fluorophores. Appropriate filters positioned between the array and the photomultiplier tubes are used to filter the signals. The emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, although the apparatus is capable of recording the spectra from both fluorophores simultaneously.

[0395] The sensitivity of the scans is typically calibrated using the signal intensity generated by a cDNA control species added to the sample mixture at a known concentration. A specific location on the array contains a complementary DNA sequence, allowing the intensity of the signal at that location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples from different sources (e.g., representing test and control cells), each labeled with a different fluorophore, are hybridized to a single array for the purpose of identifying genes that are differentially expressed, the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and adding identical amounts of each to the hybridization mixture.

[0396] The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital (A/D) conversion board (Analog Devices, Inc., Norwood Mass.) installed in an IBM-compatible PC computer. The digitized data are displayed as an image where the signal intensity is mapped using a linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission spectra) between the fluorophores using each fluorophore's emission spectrum.
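The linear 20-color transformation described above may be sketched as follows. The sketch assumes equal-width bins across the full 12-bit range, with index 0 at the blue (low signal) end and index 19 at the red (high signal) end; the actual display software and its color table are not specified in the text, and the function name is hypothetical.

```python
ADC_BITS = 12                 # 12-bit A/D board, per the text
N_COLORS = 20                 # linear 20-color scale, per the text
FULL_SCALE = 2 ** ADC_BITS    # 4096 possible codes, 0..4095


def color_index(adc_value: int) -> int:
    """Map an ADC code to a color bin: 0 = blue end (low), 19 = red end (high)."""
    if not 0 <= adc_value < FULL_SCALE:
        raise ValueError("ADC value out of 12-bit range")
    # Integer arithmetic gives 20 equal-width bins across the full scale.
    return adc_value * N_COLORS // FULL_SCALE
```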

[0397] A grid is superimposed over the fluorescence signal image such that the signal from each spot is centered in each element of the grid. The fluorescence signal within each element is then integrated to obtain a numerical value corresponding to the average intensity of the signal. The software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte). Array elements that exhibited at least about a two-fold change in expression, a signal-to-background ratio of at least 2.5, and an element spot size of at least 40% were identified as differentially expressed using the GEMTOOLS program (Incyte Genomics).
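The three selection criteria recited above can be expressed as a simple filter, sketched below. This is not the GEMTOOLS implementation, which is not described in the text. The sketch assumes that the fold change is the ratio of the two channel intensities in either direction, that the signal-to-background ratio is taken on the stronger channel, and that spot size is expressed as a fraction between 0 and 1. All names and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str
    cy3: float          # integrated Cy3 signal
    cy5: float          # integrated Cy5 signal
    background: float   # local background estimate
    spot_size: float    # fraction of the expected spot area, 0..1


def is_differentially_expressed(e: Element, min_fold: float = 2.0,
                                min_sb: float = 2.5,
                                min_size: float = 0.40) -> bool:
    ratio = e.cy5 / e.cy3
    fold = max(ratio, 1 / ratio)                 # change in either direction
    signal_to_background = max(e.cy3, e.cy5) / e.background
    return (fold >= min_fold
            and signal_to_background >= min_sb
            and e.spot_size >= min_size)


# Hypothetical array elements for illustration only.
elements = [
    Element("a", cy3=1000, cy5=2500, background=200, spot_size=0.8),
    Element("b", cy3=1000, cy5=1500, background=200, spot_size=0.8),
    Element("c", cy3=300, cy5=100, background=150, spot_size=0.9),
    Element("d", cy3=4000, cy5=1000, background=100, spot_size=0.3),
]
selected = [e.name for e in elements if is_differentially_expressed(e)]
```

In this sketch, element "b" fails the fold-change criterion, "c" fails the signal-to-background criterion, and "d" fails the spot-size criterion.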

[0398] Expression

[0399] In one example, SEQ ID NO:31 showed differential expression as determined by microarray analysis. Histological and molecular evaluation of breast tumors reveals that the development of breast cancer evolves through a multi-step process whereby pre-malignant mammary epithelial cells undergo a relatively defined sequence of events leading to tumor formation. A cross-comparison experimental design was used to evaluate the expression of cDNAs from four human breast tumor cell lines (MCF-7, T-47D, Sk-Br-3, and MDA-mb-231) at various stages of tumor progression, as compared to a non-malignant mammary epithelial cell line, HMEC (Clonetics, San Diego, Calif.). All cell cultures were propagated in media according to the supplier's recommendations and grown to 70-80% confluence prior to RNA isolation.

[0400] The expression of SEQ ID NO:31 was increased at least two-fold in MCF-7 cells, a nonmalignant breast adenocarcinoma cell line isolated from the pleural effusion of a 69-year old female. MCF-7 has retained characteristics of the mammary epithelium such as the ability to process estradiol via cytoplasmic estrogen receptors and the capacity to form domes in culture. The expression of SEQ ID NO:31 was also increased by at least two-fold in T-47D cells, a breast carcinoma cell line isolated from a pleural effusion obtained from a 54-year old female with an infiltrating ductal carcinoma of the breast. The expression of SEQ ID NO:31 was also found to be increased by at least two-fold in Sk-Br-3 cells, a breast adenocarcinoma cell line isolated from a malignant pleural effusion of a 43-year old female. Sk-Br-3 cells form poorly differentiated adenocarcinoma when injected into nude mice. The expression of SEQ ID NO:31 was also increased by at least two-fold in MDA-mb-231 cells, a breast tumor cell line isolated from the pleural effusion of a 51-year old female. MDA-mb-231 cells form poorly differentiated adenocarcinoma in nude mice and ALS-treated BALB/c mice. These cells also express the Wnt3 oncogene, EGF, and TGF-α. These experiments indicate that SEQ ID NO:31 was significantly overexpressed in the breast tumor cell lines.

[0401] In another example, SEQ ID NO:39 and SEQ ID NO:42 showed differential expression in breast tumor cell lines, as determined by microarray analysis. The expression of SEQ ID NO:39 and SEQ ID NO:42 was decreased by at least two-fold in breast tumor cell lines that were harvested from donors with various stages of tumor progression and malignant transformation, when compared to the expression levels of SEQ ID NO:39 or SEQ ID NO:42 in normal breast epithelial cells, respectively. Therefore, in various embodiments, SEQ ID NO:31, SEQ ID NO:39, or SEQ ID NO:42 can be used for one or more of the following: i) monitoring treatment of breast cancer, ii) diagnostic assays for breast cancer, and iii) developing therapeutics and/or other treatments for breast cancer.

[0402] In another example, SEQ ID NO:32 showed differential expression in brain cingulate from a patient with Alzheimer's disease as determined by microarray analysis. The expression of SEQ ID NO:32 was increased at least two-fold in cingulate tissue with Alzheimer's disease compared to matched microscopically normal tissue from the same donor. Therefore, SEQ ID NO:32 can be used for one or more of the following: i) monitoring treatment of Alzheimer's disease, ii) diagnostic assays for Alzheimer's disease, and iii) developing therapeutics and/or other treatments for Alzheimer's disease.

[0403] In yet another example, SEQ ID NO:39 showed differential expression in toxicology studies as determined by microarray analysis. The expression of SEQ ID NO:39 was decreased by at least two-fold in the human C3A liver cell line treated with various drugs (e.g., steroids) relative to untreated C3A cells. The human C3A cell line is a clonal derivative of HepG2/C3 (hepatoma cell line, isolated from a 15-year-old male with liver tumor), which was selected for strong contact inhibition of growth. The C3A cell line is well established as an in vitro model of the mature human liver (Mickelson, J. K. et al. (1995) Hepatology 22:866-875; Nagendra, A. R. et al. (1997) Am. J. Physiol. 272:G408-G416). Effects upon liver metabolism are important to understanding the pharmacodynamics of a drug. Therefore, in various embodiments, SEQ ID NO:39 can be used for one or more of the following: i) diagnosis and monitoring of liver, endocrine, and reproductive diseases, ii) diagnosis and monitoring of liver toxicity and clearance, and iii) developing therapeutics and/or other treatments for liver toxicity and clearance.

[0404] In yet another example, SEQ ID NO:42 showed differential expression in prostate cancer cell lines, as determined by microarray analysis. The prostate carcinoma cell lines were isolated from metastatic sites in the brain, bone, and lymph nodes of donors with widespread metastatic prostate carcinoma. The control prostate epithelial cell line was isolated from a normal donor. Expression of SEQ ID NO:42 was decreased by at least two-fold in various prostate carcinoma cell lines relative to the expression levels measured in the control normal prostate epithelial cell line. Therefore, in various embodiments, SEQ ID NO:42 can be used for one or more of the following: i) monitoring treatment of prostate cancer, ii) diagnostic assays for prostate cancer, and iii) developing therapeutics and/or other treatments for prostate cancer.

[0405] XII. Complementary Polynucleotides

[0406] Sequences complementary to the INTSIG-encoding sequences, or any parts thereof, are used to detect, decrease, or inhibit expression of naturally occurring INTSIG. Although use of oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of INTSIG. To inhibit transcription, a complementary oligonucleotide is designed from the most unique 5′ sequence and used to prevent promoter binding to the coding sequence. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding to the INTSIG-encoding transcript.

[0407] XIII. Expression of INTSIG

[0408] Expression and purification of INTSIG is achieved using bacterial or virus-based expression systems. For expression of INTSIG in bacteria, cDNA is subcloned into an appropriate vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21 (DE3). Antibiotic resistant bacteria express INTSIG upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG). Expression of INTSIG in eukaryotic cells is achieved by infecting insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding INTSIG by either homologous recombination or bacterial-mediated transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional genetic modifications to baculovirus (Engelhard, E. K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945).

[0409] In most expression systems, INTSIG is synthesized as a fusion protein with, e.g., glutathione S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham Biosciences). Following purification, the GST moiety can be proteolytically cleaved from INTSIG at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak). 6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins (QIAGEN). Methods for protein expression and purification are discussed in Ausubel et al. (supra, ch. 10 and 16). Purified INTSIG obtained by these methods can be used directly in the assays shown in Examples XVII and XVIII, where applicable.

[0410] XIV. Functional Assays

[0411] INTSIG function is assessed by expressing the sequences encoding INTSIG at physiologically elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice include PCMV SPORT plasmid (Invitrogen, Carlsbad Calif.) and PCR3.1 plasmid (Invitrogen), both of which contain the cytomegalovirus promoter. 5-10 μg of recombinant vector are transiently transfected into a human cell line, for example, an endothelial or hematopoietic cell line, using either liposome formulations or electroporation. 1-2 μg of an additional plasmid containing sequences encoding a marker protein are co-transfected. Expression of a marker protein provides a means to distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These events include changes in nuclear DNA content as measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994; Flow Cytometry, Oxford, New York N.Y.).

[0412] The influence of INTSIG on gene expression can be assessed using highly purified populations of cells transfected with sequences encoding INTSIG and either CD64 or CD64-GFP. CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success N.Y.). mRNA can be purified from the cells using methods well known by those of skill in the art. Expression of mRNA encoding INTSIG and other genes of interest can be analyzed by northern analysis or microarray techniques.

[0413] XV. Production of INTSIG Specific Antibodies

[0414] INTSIG substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., Harrington, M. G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to immunize animals (e.g., rabbits, mice, etc.) and to produce antibodies using standard protocols.

[0415] Alternatively, the INTSIG amino acid sequence is analyzed using LASERGENE software (DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is synthesized and used to raise antibodies by means known to those of skill in the art. Methods for selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in the art (Ausubel et al., supra, ch. 11).

[0416] Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH (Sigma-Aldrich, St. Louis Mo.) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity (Ausubel et al., supra). Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-INTSIG activity by, for example, binding the peptide or INTSIG to a substrate, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG.

[0417] XVI. Purification of Naturally Occurring INTSIG Using Specific Antibodies

[0418] Naturally occurring or recombinant INTSIG is substantially purified by immunoaffinity chromatography using antibodies specific for INTSIG. An immunoaffinity column is constructed by covalently coupling anti-INTSIG antibody to an activated chromatographic resin, such as CNBr-activated SEPHAROSE (Amersham Biosciences). After the coupling, the resin is blocked and washed according to the manufacturer's instructions.

[0419] Media containing INTSIG are passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of INTSIG (e.g., high ionic strength buffers in the presence of detergent). The column is eluted under conditions that disrupt antibody/INTSIG binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion), and INTSIG is collected.

[0420] XVII. Identification of Molecules Which Interact with INTSIG

[0421] INTSIG, or biologically active fragments thereof, are labeled with ¹²⁵I Bolton-Hunter reagent (Bolton, A. E. and W. M. Hunter (1973) Biochem. J. 133:529-539). Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled INTSIG, washed, and any wells with labeled INTSIG complex are assayed. Data obtained using different concentrations of INTSIG are used to calculate values for the number, affinity, and association of INTSIG with the candidate molecules.

[0422] Alternatively, molecules interacting with INTSIG are analyzed using the yeast two-hybrid system as described in Fields, S. and O. Song (1989; Nature 340:245-246), or using commercially available kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech).

[0423] INTSIG may also be used in the PATHCALLING process (CuraGen Corp., New Haven Conn.) which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Pat. No. 6,057,101).

[0424] XVIII. Demonstration of INTSIG Activity

[0425] INTSIG activity is associated with its ability to form protein-protein complexes and is measured by its ability to regulate growth characteristics of NIH3T3 mouse fibroblast cells. A cDNA encoding INTSIG is subcloned into an appropriate eukaryotic expression vector. This vector is transfected into NIH3T3 cells using methods known in the art. Transfected cells are compared with non-transfected cells for the following quantifiable properties: growth in culture to high density, reduced attachment of cells to the substrate, altered cell morphology, and ability to induce tumors when injected into immunodeficient mice. The activity of INTSIG is proportional to the extent of increased growth or frequency of altered cell morphology in NIH3T3 cells transfected with INTSIG.

[0426] Alternatively, INTSIG activity is measured by binding of INTSIG to radiolabeled formin polypeptides containing the proline-rich region that specifically binds to SH3 containing proteins (Chan, D. C. et al. (1996) EMBO J. 15:1045-1054). Samples of INTSIG are run on SDS-PAGE gels, and transferred onto nitrocellulose by electroblotting. The blots are blocked for 1 hr at room temperature in TBST (137 mM NaCl, 2.7 mM KCl, 25 mM Tris (pH 8.0) and 0.1% Tween-20) containing non-fat dry milk. Blots are then incubated with TBST containing the radioactive formin polypeptide for 4 hrs to overnight. After washing the blots four times with TBST, the blots are exposed to autoradiographic film. Radioactivity is quantitated by cutting out the radioactive spots and counting them in a radioisotope counter. The amount of radioactivity recovered is proportional to the activity of INTSIG in the assay.

[0427] Alternatively, PDE activity of INTSIG is measured by monitoring the conversion of a cyclic nucleotide (either cAMP or cGMP) to its nucleotide monophosphate. The use of tritium-containing substrates such as ³H-cAMP and ³H-cGMP, and 5′ nucleotidase from snake venom, allows the PDE reaction to be followed using a scintillation counter. cAMP-specific PDE activity of INTSIG is assayed by measuring the conversion of ³H-cAMP to ³H-adenosine in the presence of INTSIG and 5′ nucleotidase. A one-step assay is run using a 100 μl reaction containing 50 mM Tris-HCl pH 7.5, 10 mM MgCl₂, 0.1 unit 5′ nucleotidase (from Crotalus atrox venom), 0.0062-0.1 μM ³H-cAMP, and various concentrations of cAMP (0.0062-3 mM). The reaction is started by the addition of 25 μl of diluted enzyme supernatant. Reactions are run directly in mini Poly-Q scintillation vials (Beckman Instruments, Fullerton Calif.). Assays are incubated at 37° C. for a time period that would give less than 15% cAMP hydrolysis to avoid non-linearity associated with product inhibition. The reaction is stopped by the addition of 1 ml of Dowex (Dow Chemical, Midland Mich.) AG1x8 (Cl form) resin (1:3 slurry). Three ml of scintillation fluid are added, and the vials are mixed. The resin in the vials is allowed to settle for one hour before counting. Soluble radioactivity associated with ³H-adenosine is quantitated using a beta scintillation counter. The amount of radioactivity recovered is proportional to the cAMP-specific PDE activity of INTSIG in the reaction. For inhibitor or agonist studies, reactions are carried out under the conditions described above, with the addition of 1% DMSO, 50 nM cAMP, and various concentrations of the inhibitor or agonist. Control reactions are carried out with all reagents except for the enzyme aliquot.

[0428] In an alternative assay, cGMP-specific PDE activity of INTSIG is assayed by measuring the conversion of ³H-cGMP to ³H-guanosine in the presence of INTSIG and 5′ nucleotidase. A one-step assay is run using a 100 μl reaction containing 50 mM Tris-HCl pH 7.5, 10 mM MgCl₂, 0.1 unit 5′ nucleotidase (from Crotalus atrox venom), and 0.0064-2.0 μM ³H-cGMP. The reaction is started by the addition of 25 μl of diluted enzyme supernatant. Reactions are run directly in mini Poly-Q scintillation vials (Beckman Instruments). Assays are incubated at 37° C. for a time period that would yield less than 15% cGMP hydrolysis in order to avoid non-linearity associated with product inhibition. The reaction is stopped by the addition of 1 ml of Dowex (Dow Chemical, Midland Mich.) AG1x8 (Cl form) resin (1:3 slurry). Three ml of scintillation fluid are added, and the vials are mixed. The resin in the vials is allowed to settle for one hour before counting. Soluble radioactivity associated with ³H-guanosine is quantitated using a beta scintillation counter. The amount of radioactivity recovered is proportional to the cGMP-specific PDE activity of INTSIG in the reaction. For inhibitor or agonist studies, reactions are carried out under the conditions described above, with the addition of 1% DMSO, 50 nM cGMP, and various concentrations of the inhibitor or agonist. Control reactions are carried out with all reagents except for the enzyme aliquot.

[0429] Alternatively, INTSIG protein kinase activity is measured by quantifying the phosphorylation of an appropriate substrate in the presence of gamma-labeled ³²P-ATP. INTSIG is incubated with the substrate, ³²P-ATP, and an appropriate kinase buffer. The ³²P incorporated into the product is separated from free ³²P-ATP by electrophoresis, and the incorporated ³²P is quantified using a beta radioisotope counter. The amount of incorporated ³²P is proportional to the protein kinase activity of INTSIG in the assay. A determination of the specific amino acid residue phosphorylated by protein kinase activity is made by phosphoamino acid analysis of the hydrolyzed protein. Alternatively, an assay for INTSIG protein phosphatase activity measures the hydrolysis of para-nitrophenyl phosphate (PNPP). INTSIG is incubated together with PNPP in HEPES buffer pH 7.5, in the presence of 0.1% β-mercaptoethanol at 37° C. for 60 min. The reaction is stopped by the addition of 6 ml of 10 N NaOH, and the increase in light absorbance of the reaction mixture at 410 nm resulting from the hydrolysis of PNPP is measured using a spectrophotometer. The increase in light absorbance is proportional to the phosphatase activity of INTSIG in the assay (Diamond, R. L. et al. (1994) Mol. Cell. Biol. 14:3752-3762).
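As a non-limiting sketch, the increase in absorbance at 410 nm may be converted to an amount of released para-nitrophenol using the Beer-Lambert relation A = ε·l·c. The extinction coefficient, path length, and measured volume used in the example are assumptions supplied for illustration (a value near 18,000 M⁻¹ cm⁻¹ is commonly cited for para-nitrophenolate under alkaline conditions) and are not taken from the assay described above.

```python
# Illustrative sketch only: parameter values in the example are assumptions.

def pnp_released_nmol(delta_a410, epsilon_m_cm, path_cm, volume_ml):
    """Convert an absorbance increase into nmol of para-nitrophenol.

    delta_a410:   absorbance of the sample minus absorbance of a blank
    epsilon_m_cm: molar extinction coefficient in M^-1 cm^-1 (assumed)
    path_cm:      optical path length in cm (assumed)
    volume_ml:    volume of the measured mixture in ml (assumed)
    """
    if epsilon_m_cm <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    conc_molar = delta_a410 / (epsilon_m_cm * path_cm)  # mol per liter
    return conc_molar * (volume_ml * 1e-3) * 1e9        # liters -> mol -> nmol


# Hypothetical reading chosen only to demonstrate the arithmetic.
released = pnp_released_nmol(delta_a410=0.36, epsilon_m_cm=18000.0,
                             path_cm=1.0, volume_ml=1.0)
print(f"para-nitrophenol released: {released:.1f} nmol")
```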

[0430] Alternatively, adenylyl cyclase activity of INTSIG is demonstrated by the ability to convert ATP to cAMP (Mittal, C. K. (1986) Meth. Enzymol. 132:422-428). In this assay INTSIG is incubated with the substrate [α-³²P]ATP, following which the excess substrate is separated from the product cyclic [³²P]AMP. INTSIG activity is determined in 12×75 mm disposable culture tubes containing 5 μl of 0.6 M Tris-HCl, pH 7.5, 5 μl of 0.2 M MgCl₂, 5 μl of 150 mM creatine phosphate containing 3 units of creatine phosphokinase, 5 μl of 4.0 mM 1-methyl-3-isobutylxanthine, 5 μl of 20 mM cAMP, 5 μl of 20 mM dithiothreitol, 5 μl of 10 mM ATP, 10 μl of [α-³²P]ATP (2-4×10⁶ cpm), and water in a total volume of 100 μl. The reaction mixture is prewarmed to 30° C. The reaction is initiated by adding INTSIG to the prewarmed reaction mixture. After 10-15 minutes of incubation at 30° C., the reaction is terminated by adding 25 μl of 30% ice-cold trichloroacetic acid (TCA). Zero-time incubations and reactions incubated in the absence of INTSIG are used as negative controls. Products are separated by ion exchange chromatography, and cyclic [³²P]AMP is quantified using a β-radioisotope counter. The INTSIG activity is proportional to the amount of cyclic [³²P]AMP formed in the reaction.
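For convenience, the final concentration of each unlabeled component in the 100 μl adenylyl cyclase reaction follows from the stock concentrations and volumes recited above (final = stock × volume added / total volume). The short sketch below performs this arithmetic; the dictionary layout and function name are illustrative assumptions, and the radiolabeled ATP, enzyme, and water are omitted because their concentrations are not specified.

```python
# Illustrative sketch only: stock concentrations and volumes are those recited
# for the assay; the data layout and names are hypothetical.

TOTAL_VOLUME_UL = 100.0

# component name -> (stock concentration in mM, volume added in microliters)
COMPONENTS = {
    "Tris-HCl, pH 7.5": (600.0, 5.0),
    "MgCl2": (200.0, 5.0),
    "creatine phosphate": (150.0, 5.0),
    "1-methyl-3-isobutylxanthine": (4.0, 5.0),
    "cAMP": (20.0, 5.0),
    "dithiothreitol": (20.0, 5.0),
    "ATP": (10.0, 5.0),
}


def final_concentration_mm(stock_mm, volume_ul, total_ul=TOTAL_VOLUME_UL):
    """Dilution arithmetic: final = stock * (volume added / total volume)."""
    return stock_mm * volume_ul / total_ul


final = {name: final_concentration_mm(stock, vol)
         for name, (stock, vol) in COMPONENTS.items()}

for name, conc in final.items():
    print(f"{name}: {conc:g} mM")
```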

[0431] An alternative assay measures INTSIG-mediated G-protein signaling activity by monitoring the mobilization of Ca²⁺ as an indicator of the signal transduction pathway stimulation (Grynkiewicz, G. et al. (1985) J. Biol. Chem. 260:3440; McColl, S. et al. (1993) J. Immunol. 150:4550-4555; Aussel, supra). The assay requires preloading neutrophils or T cells with a fluorescent dye such as FURA-2 or BCECF (Universal Imaging Corp, Westchester Pa.) whose emission characteristics are altered by Ca²⁺ binding. When the cells are exposed to one or more activating stimuli artificially (e.g., anti-CD3 antibody ligation of the T cell receptor) or physiologically (e.g., by allogeneic stimulation), Ca²⁺ flux takes place. This flux can be observed and quantified by assaying the cells in a fluorometer or fluorescence-activated cell sorter. Measurements of Ca²⁺ flux are compared between cells in their normal state and those transfected with INTSIG. Increased Ca²⁺ mobilization attributable to increased INTSIG concentration is proportional to INTSIG activity.
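Where a ratiometric dye such as FURA-2 is used, the fluorescence ratio may be converted to an estimated free Ca²⁺ concentration using the relation of Grynkiewicz et al. (1985), [Ca²⁺] = Kd × β × (R − Rmin)/(Rmax − R). The sketch below is one possible implementation; the calibration values (Rmin, Rmax, β) and the dissociation constant in the example are hypothetical and would in practice be determined for the particular instrument and conditions.

```python
# Illustrative sketch only: calibration values in the example are hypothetical.

def calcium_from_ratio(r, r_min, r_max, kd_nm, beta):
    """Estimate free Ca2+ (nM) from a dual-excitation fluorescence ratio.

    r:     measured fluorescence ratio
    r_min: ratio at zero Ca2+ (calibration, assumed)
    r_max: ratio at saturating Ca2+ (calibration, assumed)
    kd_nm: dye dissociation constant in nM (assumed)
    beta:  ratio of the free-dye to bound-dye signal at the second
           excitation wavelength (calibration, assumed)
    """
    if not (r_min < r < r_max):
        raise ValueError("ratio must lie strictly between r_min and r_max")
    return kd_nm * beta * (r - r_min) / (r_max - r)


# Hypothetical readings for control cells and cells transfected with INTSIG.
control_nm = calcium_from_ratio(r=2.0, r_min=0.5, r_max=8.0, kd_nm=224.0, beta=2.0)
transfected_nm = calcium_from_ratio(r=3.5, r_min=0.5, r_max=8.0, kd_nm=224.0, beta=2.0)
print(f"control: {control_nm:.1f} nM, transfected: {transfected_nm:.1f} nM")
```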

[0432] An assay for INTSIG activity measures the binding of INTSIG to Ca²⁺ using a Ca²⁺ overlay system (Weis, K. et al. (1994) J. Biol. Chem. 269:19142-19150). Purified INTSIG is transferred and immobilized onto a nitrocellulose membrane. The membrane is washed three times with buffer (60 mM KCl, 5 mM MgCl₂, 10 mM imidazole-HCl, pH 6.8) and incubated in this buffer for 10 minutes with 1 μCi [⁴⁵Ca²⁺] (NEN-DuPont, Boston, Mass.). Unbound [⁴⁵Ca²⁺] is removed from the membrane by washing with water, and the membrane is dried. Membrane-bound [⁴⁵Ca²⁺] is detected by autoradiography and quantified using image analysis systems and software. INTSIG activity is proportional to the amount of [⁴⁵Ca²⁺] detected on the membrane.

[0433] Alternatively, calcium binding activity is measured by determining the ability of INTSIG to down-regulate mitosis. INTSIG can be expressed by transforming a mammalian cell line such as COS7, HeLa, or CHO with a eukaryotic expression vector encoding INTSIG. Eukaryotic expression vectors are commercially available, and the techniques to introduce them into cells are well known to those skilled in the art. The cells are incubated for 48-72 hours after transformation under conditions appropriate for the cell line to allow expression of INTSIG. Phase microscopy is used to compare the mitotic index of transformed versus control cells. A decrease in the mitotic index indicates INTSIG activity.
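By way of example only, the mitotic index may be taken as the fraction of scored cells that are in mitosis, and the comparison between transformed and control cells may be tabulated as sketched below. The function name and the cell counts are hypothetical and serve only to illustrate the comparison.

```python
# Illustrative sketch only: cell counts are hypothetical.

def mitotic_index(mitotic_cells, total_cells):
    """Fraction of scored cells observed in mitosis."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= mitotic_cells <= total_cells:
        raise ValueError("mitotic_cells must lie between 0 and total_cells")
    return mitotic_cells / total_cells


control_index = mitotic_index(mitotic_cells=50, total_cells=1000)
transformed_index = mitotic_index(mitotic_cells=20, total_cells=1000)

# A lower index in transformed cells is the outcome described as indicating
# INTSIG activity.
indicates_activity = transformed_index < control_index
print(control_index, transformed_index, indicates_activity)
```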

[0434] Alternatively, GTP-binding activity of INTSIG is determined in an assay that measures the binding of INTSIG to [α-³²P]-labeled GTP. Purified INTSIG is first blotted onto filters and rinsed in a suitable buffer. The filters are then incubated in buffer containing radiolabeled [α-³²P]-GTP. The filters are washed in buffer to remove unbound GTP and counted in a radioisotope counter. Non-specific binding is determined in an assay that contains a 100-fold excess of unlabeled GTP. The amount of specific binding is proportional to the activity of INTSIG.
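The specific binding referred to above is conventionally obtained by subtracting the counts measured in the presence of excess unlabeled GTP (non-specific binding) from the counts measured in its absence (total binding). A minimal sketch of this subtraction follows; the use of replicate means and the example counts are assumptions made for illustration.

```python
# Illustrative sketch only: replicate counts are hypothetical.

def mean(values):
    """Arithmetic mean of a non-empty sequence of counts."""
    if not values:
        raise ValueError("at least one replicate is required")
    return sum(values) / len(values)


def specific_binding_cpm(total_cpm, nonspecific_cpm):
    """Mean total binding minus mean non-specific binding, floored at zero."""
    return max(mean(total_cpm) - mean(nonspecific_cpm), 0.0)


# Filters incubated with labeled GTP alone, and with a 100-fold excess of
# unlabeled GTP, respectively (hypothetical replicate counts).
total = [1200.0, 1260.0, 1230.0]
nonspecific = [200.0, 220.0, 210.0]

specific = specific_binding_cpm(total, nonspecific)
print(f"specific binding: {specific:.0f} cpm")
```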

[0435] Alternatively, GTPase activity of INTSIG is determined in an assay that measures the conversion of [α-³²P]-GTP to [α-³²P]-GDP. INTSIG is incubated with [α-³²P]-GTP in buffer for an appropriate period of time, and the reaction is terminated by heating or acid precipitation followed by centrifugation. An aliquot of the supernatant is subjected to polyacrylamide gel electrophoresis (PAGE) to separate GDP and GTP together with unlabeled standards. The GDP spot is cut out and counted in a radioisotope counter. The amount of radioactivity recovered in GDP is proportional to the GTPase activity of INTSIG.

[0436] Alternatively, INTSIG activity is measured by quantifying the amount of a non-hydrolyzable GTP analogue, GTPγS, bound over a 10 minute incubation period. Varying amounts of INTSIG are incubated at 30° C. in 50 mM Tris buffer, pH 7.5, containing 1 mM dithiothreitol, 1 mM EDTA and 1 μM [³⁵S]GTPγS. Samples are passed through nitrocellulose filters and washed twice with a buffer consisting of 50 mM Tris-HCl, pH 7.8, 1 mM NaN₃, 10 mM MgCl₂, 1 mM EDTA, 0.5 mM dithiothreitol, 0.01 mM PMSF, and 200 mM NaCl. The filter-bound counts are measured by liquid scintillation to quantify the amount of bound [³⁵S]GTPγS. INTSIG activity may also be measured as the amount of GTP hydrolyzed over a 10 minute incubation period at 37° C. INTSIG is incubated in 50 mM Tris-HCl buffer, pH 7.8, containing 1 mM dithiothreitol, 2 mM EDTA, 10 μM [α-³²P]GTP, and 1 μM H-rab protein. GTPase activity is initiated by adding MgCl₂ to a final concentration of 10 mM. Samples are removed at various time points, mixed with an equal volume of ice-cold 0.5 mM EDTA, and frozen. Aliquots are spotted onto polyethyleneimine-cellulose thin layer chromatography plates, which are developed in 1 M LiCl, dried, and autoradiographed. The signal detected is proportional to INTSIG activity.
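Because the label in [α-³²P]GTP is retained in the GDP product, the fraction of GTP hydrolyzed at each time point may be estimated from the thin layer chromatography signals as GDP/(GDP + GTP), and an initial rate may be obtained from the slope of that fraction against time. The sketch below illustrates one way to do so; the time points, spot intensities, and function names are hypothetical.

```python
# Illustrative sketch only: time points and spot intensities are hypothetical.

def fraction_gdp(gdp_signal, gtp_signal):
    """Fraction of labeled nucleotide present as GDP in one lane."""
    total = gdp_signal + gtp_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return gdp_signal / total


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


times_min = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
# (GDP spot, GTP spot) intensities quantified from the autoradiograph.
spots = [(0.0, 1000.0), (20.0, 980.0), (40.0, 960.0),
         (60.0, 940.0), (80.0, 920.0), (100.0, 900.0)]

fractions = [fraction_gdp(gdp, gtp) for gdp, gtp in spots]
rate_per_min = least_squares_slope(times_min, fractions)
print(f"fraction of GTP hydrolyzed per minute: {rate_per_min:.4f}")
```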

[0437] Alternatively, INTSIG activity may be demonstrated as the ability to interact with its associated LMW GTPase in an in vitro binding assay. The candidate LMW GTPases are expressed as fusion proteins with glutathione S-transferase (GST), and purified by affinity chromatography on glutathione-Sepharose. The LMW GTPases are loaded with GDP by incubation in 20 mM Tris buffer, pH 8.0, containing 100 mM NaCl, 2 mM EDTA, 5 mM MgCl₂, 0.2 mM DTT, 100 μM AMP-PNP and 10 μM GDP at 30° C. for 20 minutes. INTSIG is expressed as a FLAG fusion protein in a baculovirus system. Extracts of these baculovirus cells containing INTSIG-FLAG fusion proteins are precleared with GST beads, then incubated with GST-GTPase fusion proteins. The complexes formed are precipitated by glutathione-Sepharose and separated by SDS-polyacrylamide gel electrophoresis. The separated proteins are blotted onto nitrocellulose membranes and probed with commercially available anti-FLAG antibodies. INTSIG activity is proportional to the amount of INTSIG-FLAG fusion protein detected in the complex.

[0438] Another alternative assay to detect INTSIG activity is the use of a yeast two-hybrid system (Zalcman, G. et al. (1996) J. Biol. Chem. 271:30366-30374). Specifically, a plasmid such as pGAD1318 which may contain the coding region of INTSIG can be used to transform reporter L40 yeast cells which contain the reporter genes LacZ and HIS3 downstream from the binding sequences for LexA. These yeast cells have been previously transformed with a pLexA-Rab6-GDP (mouse) plasmid or with a plasmid which contains pLexA-lamin C. The pLexA-lamin C cells serve as a negative control. The transformed cells are plated on a histidine-free medium and incubated at 30° C. for 3 days. His⁺ colonies are subsequently patched on selective plates and assayed for β-galactosidase activity by a filter assay. INTSIG binding with Rab6-GDP is indicated by positive His⁺/lacZ⁺ activity for the cells transformed with the plasmid containing the mouse Rab6-GDP and negative His⁺/lacZ⁺ activity for those transformed with the plasmid containing lamin C.

[0439] Alternatively, INTSIG activity is measured by binding of INTSIG to a substrate which recognizes WD-40 repeats, such as ElonginB, by coimmunoprecipitation (Kamura, T. et al. (1998) Genes Dev. 12:3872-3881). Briefly, epitope tagged substrate and INTSIG are mixed and immunoprecipitated with commercial antibody against the substrate tag. The reaction solution is run on SDS-PAGE and the presence of INTSIG visualized using an antibody to the INTSIG tag. Substrate binding is proportional to INTSIG activity.

[0440] Alternatively, INTSIG activity is measured by its inclusion in coated vesicles. INTSIG can be expressed by transforming a mammalian cell line such as COS7, HeLa, or CHO with a eukaryotic expression vector encoding INTSIG. Eukaryotic expression vectors are commercially available, and the techniques to introduce them into cells are well known to those skilled in the art. A small amount of a second plasmid, which expresses any one of a number of marker genes, such as β-galactosidase, is co-transformed into the cells in order to allow rapid identification of those cells which have taken up and expressed the foreign DNA. The cells are incubated for 48-72 hours after transformation under conditions appropriate for the cell line to allow expression and accumulation of INTSIG and β-galactosidase.

[0441] In the alternative, INTSIG activity is measured by its ability to alter vesicle trafficking pathways. Vesicle trafficking in cells transformed with INTSIG is examined using fluorescence microscopy. Antibodies specific for vesicle coat proteins or typical vesicle trafficking substrates such as transferrin or the mannose-6-phosphate receptor are commercially available. Various cellular components such as ER, Golgi bodies, peroxisomes, endosomes, lysosomes, and the plasmalemma are examined. Alterations in the numbers and locations of vesicles in cells transformed with INTSIG as compared to control cells are characteristic of INTSIG activity. Transformed cells are collected and cell lysates are assayed for vesicle formation. A non-hydrolyzable form of GTP, GTPγS, and an ATP regenerating system are added to the lysate and the mixture is incubated at 37° C. for 10 minutes. Under these conditions, over 90% of the vesicles remain coated (Orci, L. et al. (1989) Cell 56:357-368). Transport vesicles are salt-released from the Golgi membranes, loaded under a sucrose gradient, centrifuged, and fractions are collected and analyzed by SDS-PAGE. Co-localization of INTSIG with clathrin or COP coatomer is indicative of INTSIG activity in vesicle formation. The contribution of INTSIG in vesicle formation can be confirmed by incubating lysates with antibodies specific for INTSIG prior to GTPγS addition. The antibody will bind to INTSIG and interfere with its activity, thus preventing vesicle formation.

[0442] Various modifications and variations of the described compositions, methods, and systems of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. It will be appreciated that the invention provides novel and useful proteins, and their encoding polynucleotides, which can be used in the drug discovery process, as well as methods for using these compositions for the detection, diagnosis, and treatment of diseases and conditions. Although the invention has been described in connection with certain embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Nor should the description of such embodiments be considered exhaustive or limit the invention to the precise forms disclosed. Furthermore, elements from one embodiment can be readily recombined with elements from one or more other embodiments. Such combinations can form a number of embodiments within the scope of the invention. It is intended that the scope of the invention be defined by the following claims and their equivalents.
TABLE 1
Incyte        Polypeptide   Incyte           Polynucleotide   Incyte
Project ID    SEQ ID NO:    Polypeptide ID   SEQ ID NO:       Polynucleotide ID
7461789       1             7461789CD1       24               7461789CB1
1210450       2             1210450CD1       25               1210450CB1
427539        3             427539CD1        26               427539CB1
1545043       4             1545043CD1       27               1545043CB1
7488231       5             7488231CD1       28               7488231CB1
1910008       6             1910008CD1       29               1910008CB1
5151459       7             5151459CD1       30               5151459CB1
55140256      8             55140256CD1      31               55140256CB1
2744344       9             2744344CD1       32               2744344CB1
1555147       10            1555147CD1       33               1555147CB1
1939136       11            1939136CD1       34               1939136CB1
5956978       12            5956978CD1       35               5956978CB1
7662817       13            7662817CD1       36               7662817CB1
55139221      14            55139221CD1      37               55139221CB1
7493736       15            7493736CD1       38               7493736CB1
4614878       16            4614878CD1       39               4614878CB1
7498437       17            7498437CD1       40               7498437CB1
3097848       18            3097848CD1       41               3097848CB1
2957789       19            2957789CD1       42               2957789CB1
5922849       20            5922849CD1       43               5922849CB1
7472828       21            7472828CD1       44               7472828CB1
8088595       22            8088595CD1       45               8088595CB1
7488478       23            7488478CD1       46               7488478CB1

[0443] TABLE 2
Polypeptide   Incyte           GenBank      Probability
SEQ ID NO:    Polypeptide ID   ID NO:       Score        Annotation
1             7461789CD1       g393095      1.4E−28      [Homo sapiens] guanine nucleotide regulatory protein
                                                         Tan, E. C. et al. (1993) J. Biol. Chem. 268: 27291-27298.
2             1210450CD1       g9837381     3.4E−15      [Mus musculus] retinitis pigmentosa GTPase regulator
                                                         Vervoort, R. et al. (2000) Nat. Genet. 25: 462-464.
3             427539CD1        g13186114    2.2E−25      [Homo sapiens] rab interacting lysosomal protein
                                                         Cantalupo, G. et al. (2001) EMBO J. 20: 683-693.
4             1545043CD1       g577428      0.0          [Rattus norvegicus] Ca2+-dependent activator protein; calcium-dependent actin-binding protein
                                                         Walent, J. H. et al. (1992) Cell 70: 765-775.
                                                         Ann, K. et al. (1997) J. Biol. Chem. 272: 19637-19640.
5             7488231CD1       g1262329     2.0E−177     [Homo sapiens] reticulocalbin
                                                         Ozawa, M. (1995) J. Biochem. 117: 1113-1119.
6             1910008CD1       g3822553     1.1E−215     [Gallus gallus] nuclear calmodulin-binding protein
                                                         Lodge, A. P. et al. (1999) Eur. J. Biochem. 261: 137-147.
7             5151459CD1       g607003      1.7E−17      [Podospora anserina] beta transducin-like protein
                                                         Saupe, S. et al. (1995) Gene 162: 135-139.
                               g6069583     6.2E−22      [Mus musculus] JNK-binding protein JNKBP1
8             55140256CD1      g536844      1.5E−172     [Homo sapiens] ras GTPase-activating-like protein
                                                         Weissback, L. et al. (1994) J. Biol. Chem. 269: 17-21.
9             2744344CD1       g3687833     2.2E−12      [Xenopus laevis] notchless
                                                         Royet, J. et al. (1998) EMBO J. 17: 7351-7360.
10            1555147CD1       g12853854    0.0          [3′ incom][Mus musculus] WD domain, G-beta repeat containing protein (data source: Pfam; source key: PF00400)
                               g9858154     3.7E−24      [Homo sapiens] tubby super-family protein
                                                         Santagata, S. et al. (2001) Science 292: 2041-2050.
11            1939136CD1       g9957544     6.1E−272     [Homo sapiens] septin 2
                                                         Wenderfer, S. E. et al. (2000) Genomics 63: 354-373.
12            5956978CD1       g9755425     0.0          [Mus sp.] Dbs (homolog of Dbl guanine nucleotide exchange factor)
                                                         Whitehead, I. et al. (1995) Oncogene 10: 713-721.
13            7662817CD1       g437985      1.5E−108     [Canis familiaris] Rab12 protein
14            55139221CD1      g11862939    0.0          [Mus musculus] DDM36
15            7493736CD1       g12744923    8.0E−112     [Mus musculus] Ras association domain family 3 protein
16            4614878CD1       g6708478     0.0          [Mus musculus] formin-like protein
                                                         Yayoshi-Yamamoto, S. et al. (2000) Mol. Cell. Biol. 20: 6872-6881.
17            7498437CD1       g12855722    1.0E−134     [3′ incom][Mus musculus] ADP-ribosylation factor family containing protein (data source: Pfam; source key: PF00025)
                               g3009501     7.5E−29      [Homo sapiens] ADP-ribosylation factor-like protein 2
                                                         Clark, J. et al. (1993) Proc. Natl. Acad. Sci. USA 90: 8952-8956.
18            3097848CD1       g193444      5.1E−220     [Mus musculus] guanylate binding protein
                                                         Wynn, T. A. et al. (1992) J. Immunol. 147: 4384-4392.
19            2957789CD1       g20514209    0.0          [fl][Homo sapiens] (AF480466) Rho-GTPase activating protein 10
                               g437181      9.2E−60      [Caenorhabditis elegans] GTPase-activating protein
                                                         Chen, W. et al. (1994) J. Biol. Chem. 269: 820-823.
20            5922849CD1       g6164867     3.6E−25      [Homo sapiens] G-protein gamma 8 subunit
                                                         Hurowitz, E. H. et al. (2000) DNA Res. 7: 111-120.
21            7472828CD1       g3406749     2.3E−15      [Homo sapiens] B cell linker protein BLNK
                                                         Fu, C. and A. C. Chan (1997) J. Biol. Chem. 272: 27362-27368.
                                                         Fu, C. et al. (1998) Immunity 9: 93-103.
22            8088595CD1       g7650188     1.1E−33      [Homo sapiens] soluble adenylyl cyclase
                                                         Buck, J. et al. (1999) Proc. Natl. Acad. Sci. USA 96: 79-84.
23            7488478CD1       g18034100    1.0E−159     [fl][Mus musculus] ankyrin repeat domain-containing SOCS box protein Asb-12
                                                         Kile, B. T. et al. (2000) Gene 258: 31-41.

[0444] TABLE 3 Incyte Amino SEQ Poly- Acid Potential Potential Analytical ID peptide Resi- Phosphorylation Glycosylation Signature Sequences, Methods and NO: ID dues Sites Sites Domains and Motifs Databases 1 7461789CD1 1354 S121 S148 Y1182 S286 N504 N507 N563 Signal peptide: M1-A41 SPSCAN S329 T1162 S418 S445 N649 N673 N739 T1193 S493 S548 N1223 N1285 T1198 S568 S670 N1351 T1113 S675 S682 S1126 S690 S700 S1148 S704 S769 S1169 S771 S775 S1225 S782 S814 S1245 S827 Y500 S1254 T180 T273 S1287 T347 T406 S1319 T455 T594 T685 T790 T880 T884 T912 T953 RhoGAP domain: P992-T1155 HMMER- PFAM GTPase-activator protein BLIMPS- PF00620: D1045-P1061 PFAM GTPASE DOMAIN ACTIVATION BLIMPS- PD00930: P992-G1017, L1106-L1146 PRODOM GTPASE DOMAIN SH2 ACTIVATION BLAST- ZINC 3 KINASE PHOSPHATIDYL- PRODOM INOSITOL REGULATORY PD000780: M990-S1154 PH DOMAIN BLAST- DM00470|A49307|566-842: I973-V1189, DOMO E527-R577 DM00470|P11274|973-1254: I948-L1186 DM00470|P52757|241-463: D966-T1155 DM00470|P15882|109-331: D966-L1147 2 1210450CD1 200 S30 S72 S121 S125 GLYCOPROTEIN BLAST- S135 T42 T114 Y199 DM06164|Q01033|1-635: K7-E152 DOMO 3 427539CD1 403 S13 S138 S145 S156 N136 N143 COILED-COIL CHAIN MYOSIN BLAST- S242 S275 S277 T23 REPEAT HEAVY ATP-BINDING PRODOM T213 T295 T370 T386 FILAMENT HEPTAD PD000002: E76-K281 4 1545043CD1 1255 S5 S6 S7 S29 S148 S156 N453 N657 N811 signal_cleavage: M1-P34 SPSCAN S169 S187 S357 S454 N908 N958 S524 S572 S592 S734 S750 S777 S1047 S1068 S1077 S1091 S1101 S1169 S1234 S1249 T411 T601 T883 T962 T1012 T1116 T1178 T1200 T1227 T1245 Y305 Y1157 Y1166 PH domain: M488-G590 HMMER_(—) PFAM CALCIUM-DEPENDENT ACTIN- BLAST_(—) BINDING PROTEIN ACTIVATOR PRODOM FOR SECRETION PD023339: M1036-E1254 PD150606: R47-R726, G551-E822, I821-T1035 5 7488231CD1 346 S80 S151 S249 S262 N53 signal_cleavage: M1-A29 SPSCAN S307 T32 T76 T91 T115 T178 T202 T226 T316 Signal peptide: M1-L22, G7-A23, R8-A23, HMMER M1-R25, M1-A29, G7-A29 EF hand: R98-V126, V222-H250, HMMER_(—) N134-Y162, E299-F327, 
E263-Q291, PFAM R185-E213 EF-hand calcium-binding domain BLIMPS_(—) BL00018: D231-Y243 BLOCKS RETICULOCALBIN PRECURSOR BLAST_(—) CALCIUM-BINDING ENDOPLASMIC PRODOM RETICULUM SIGNAL GLYCOPROTEIN REPEAT DNA SUPERCOILING PD093826: R264-E319 PD007440: L18-R189 PD008339: A193-E263 PD021959: R264-L346 CALCIUM ERC-55 EF-HAND BLAST_(—) TARGETING DOMO DM03984|Q05186|1-324: G11-L346, M1-Y142 DM03984|A57516|1-322: L14-E345 DM03984|I37371|1-317: K81-E345 EF-hand calcium-binding domain: MOTIFS D107-L119, D143-Y155, D231-Y243, D272-I284 6 1910008CD1 734 S128 S161 S185 S193 N595 N653 N713 SPRY domain: K289-E418 HMMER_(—) S271 S393 S463 S556 PFAM S597 S636 T354 T448 T529 T609 T640 Y481 Y673 Y702 RIBONUCLEOPROTEIN HETERO- BLAST_(—) GENOUS NUCLEAR U PROTEIN PRODOM SCAFFOLD ATTACHMENT FACTOR A HNRNP PD150898: E419-R639 PD150846: K187-S323 PD031370: M1-A118 E1B 55 KDA-ASSOCIATED BLAST_(—) PROTEIN PRODOM PD177610: M1-E45 ATP/GTP-binding site motif MOTIFS A (P-loop): G459-T466 7 5151459CD1 636 S120 S318 S347 S353 N66 N160 N179 WD domain, G-beta repeat: HMMER_(—) S379 S536 T181 T225 N390 N605 L469-D505, V595-K631, PFAM T297 T421 T441 T466 R554-D589, L103-S141, T538 M425-Q461, L342-H376, E56-D97, R511-E546, P152-E186 Trp-Asp (WD-40) repeats signature: PROFILE- A438-V485 SCAN Beta G-protein (transducin) signature BLIMPS_(—) PR00319: L60-G76, C492-L506 PRINTS Regulator of chromosome MOTIFS condensation (RCC1) signature 2: V577-V587 Trp-Asp (WD) repeats signature: MOTIFS M173-L187, C492-L506 8 55140256CD1 593 S291 S425 S426 S447 N65 N128 N289 Signal peptide: M54-A80, M54-A92 HMMER S534 S562 S576 T24 N407 N511 T133 T338 T350 T409 T553 Y213 Y383 signal_cleavage: M54-A92 SPSCAN IQ calmodulin-binding motif: HMMER_(—) V248-R268, L218-L238, L188-L208, PFAM V158-A178 GTPase-activator protein for HMMER_(—) Ras-like GTPase: PFAM L436-G593 PROTEIN GTPASE ACTIVATION BLAST_(—) RAS GTPASE ACTIVATING-LIKE PRODOM GTPASE ACTIVATING IQ GAP1 P195 KIAA0051 CALMODULIN-BINDING PD008641: V271-0435 RAS GTPASE 
ACTIVATING BLAST_(—) PROTEINS DOMO DM07432|P46940|832-1116: F244-T528 DM07432|P33277|12-265: F295-L523 9 2744344CD1 1508 S85 S118 S127 S182 N142 N332 N1298 WD domain, G-beta repeat: HMMER_(—) S316 S388 S411 S458 A326-D365, V101-C136, PFAM S531 S543 S592 S620 I532-Q564, I277-D312, S691 S700 S713 S735 P462-N511, P631-D668, S765 S778 S811 S859 V56-D95, V187-S255, S1067 S1219 S1322 P674-L711, V418-D455, S1331 S1339 S1361 S144-D180, C371-N408, S1370 S1430 S1451 T64 I571-N613 T84 T90 T219 T250 T280 T322 T345 T355 T616 T627 T639 T748 T853 T872 T897 T938 T1031 T1070 T1150 T1326 T1355 T1373 T1380 T1435 T1439 T1498 Beta G-protein (transducin) signature BLIMPS_(—) PR00319: L635-H651, L655-A669 PRINTS Trp-Asp (WD) repeat proteins BLIMPS_(—) BL00678: T84-W94 BLOCKS Trp-Asp (WD) repeats signature: MOTIFS C82-V96, L395-T409, L655-A669 10 1555147CD1 404 S6 S10 S104 T45 T78 N60 N64 N114 WD domain, G-beta repeat: HMMER_(—) T87 T88 T298 T349 N200 S6-K42, M63-M99, R148-D184, PFAM T381 T390 E107-V141, H295-N330 COSMID C54G7 G-PROTEIN BETA BLAST_(—) WD-40 REPEATS PRODOM PD148603: T216-L403 SUPER-FAMILY TUBBY M01D7.3 BLAST_(—) CG2069 C54G7 CG5586 COSMID PRODOM PD043463: M63-D153 11 1939136CD1 567 S9 S34 S94 S113 S159 N150 N513 N542 Cell division protein: Q125-D398 HMMER- S200 S225 S247 S276 PFAM S326 S397 S405 S528 T3 T88 T151 T184 T255 T303 T439 T561 GTP-BINDING PROTEIN CELL BLAST- DIVISION SEPTIN HOMOLOGY PRODOM CONTROL CYCLE BRAIN H5 PD002565: Q125-S401 SEPTIN 2 CELL DIVISION BLAST- GTP-BINDING PRODOM PD131893: F394-K486 HCDC10 PEANUT BLAST- DM00875|JC2352|12-207: V109-P302 DOMO DM00875|P42207|16-212: V109-P302 DM00875|P39826|16-215: V109-D304 DM00875|P28661|125-322: V109-P302 ATP/GTP-binding site motif A (P-loop): MOTIFS G135-S142 12 5956978CD1 1120 S150 S171 S226 S321 N357 N618 N729 Signal peptide: M1-A24, M1-P26 HMMER S325 S398 S423 S455 N836 N1067 S494 S569 S580 S616 S619 S647 S916 S1060 S1061 S1069 T123 T202 T259 T302 T403 T532 T669 T845 Y503 Y751 Y943 PH domain: L857-T972 HMMER- 
PFAM RhoGEF domain: V662-D837 HMMER- PFAM Spectrin repeat: L378-K483 HMMER- PFAM Guanine-nucleotide dissociation BLIMPS- stimulators CDC24 family BLOCKS signatures BL00741: E776-L785, L787-L809 PROTO-ONCOGENE MCF2 GUANINE BLAST- NUCLEOTIDE RELEASING FACTOR PRODOM TRANSFORMING PROTEIN DBL PD023919: L394-H661 GUANINE NUCLEOTIDE EXCHANGE BLAST- FACTOR DBS DBL BIG SISTER PRODOM MCF2 TRANSFORMING PROTEIN SH3 DOMAIN PROTO-ONCOGENE PD115494: L975-W1120 GUANINE NUCLEOTIDE EXCHANGE BLAST- FACTOR TRANSFORMING PROTEIN PRODOM PROTO-ONCOGENE DBL MCF2 PD006893: F183-V390 GUANINE NUCLEOTIDE EXCHANGE BLAST- FACTOR TRIO ALTERNATIVE PRODOM SPLICING PROTO-ONCOGENE MCF2 PD038091: M839-Q974 DBL ONCOGENE TRANSFORMING BLAST- DM08582|S51620|292-780: P509-P996 DOMO DM08582|P10911|349-835: V510-Q974 DM05391|S51620|1-290: M217-Q508 DM05391|P10911|54-347: K218-Q508 Guanine-nucleotide dissociation stimulators MOTIFS CDC24 family signature: L787-S812 13 7662817CD1 244 S57 S114 S186 T66 signal_cleavage: M1-G27 SPSCAN T125 T129 T161 Ras family: Q44-C244 HMMER- PFAM ADP-ribosylation factors family proteins BLIMPS- BL01019: V77-K116, L120-A174 BLOCKS GTP-binding nuclear protein ran proteins BLIMPS- BL01115: L43-L86, Q176-K206 BLOCKS Transforming protein P21 RAS signature BLIMPS- PR00449: V84-S106, E146-C159, F182-I204, PRINTS L43-D64, T66-K82 GTP-BINDING LIPOPROTEIN BLAST- PRENYLATION TRANSPORT PRODOM RAS-RELATED ADP- RIBOSYLATION SUBUNIT PD000015: F41-T166, T161-N215 RAS TRANSFORMING PROTEIN BLAST- DM00006|S40207|3-148: A39-E184 DOMO DM00006|P24409|6-151: D40-E184 DM00006|P17609|6-451: D40-A185 DM00006|P22127|6-151: D40-E184 ATP/GTP-binding site motif A (P-loop): MOTIFS G49-T56 14 55139221CD1 1251 S107 S141 S218 S239 N56 N90 N102 Signal peptide: M1-G24 SPSCAN S804 S805 S837 S1035 N118 N157 N253 S1058 S1063 S1122 N484 N583 N626 S1210 T32 T78 T153 N762 N783 N795 T159 T182 T352 T498 N889 T523 T736 T778 T785 T818 T843 T864 T891 T943 T1223 Y478 Signal peptide: M1-G24, M1-L28 HMMER Fibronectin type III 
domain: HMMER- P751-G836, P631-A730, P430-S516, PFAM P848-S936, P528-S614 Immunoglobulin domain: HMMER- G259-A315, N157-A215, A350-A408, PFAM E50-A123 Receptor tyrosine kinase class V proteins BLIMPS- BL00790: K317-1338, T864-N889, BLOCKS G913-T943 IMMUNOGLOBULIN BLAST- DM00001|P28685|325-407 :S351-V423 DOMO DM00001|Q02246|324-412: E343-V423 DM00001|P22063|326-414: E343-V423 DM00001|A39712|569-655: L139-S218 Cell attachment sequence: MOTIFS R3-D5, R684-D686 ATP/GTP-binding site motif A (P-loop): MOTIFS A729-T736 15 7493736CD1 238 S6 S7 S50 S99 S159 Ras association (RalGDS/AF-6) domain: HMMER- S194 T18 T66 T120 S96-I187 PFAM T221 Y48 Y154 PUTATIVE TUMOR SUPPRESSOR BLAST- MAXP1 RAS EFFECTOR NORE1 PRODOM PHORBOL ESTER PD150733: W190-L231 16 4614878CD1 1082 S107 S150 S154 S165 N395 N635 N756 Formin Homology 2 Domain: HMMER_(—) S186 S199 S210 S279 N828 N1057 I611-D1048 PFAM S324 S355 S400 S460 S673 S683 S758 S768 S806 S810 S829 S830 S1013 S1016 T86 T202 T353 T406 T444 T474 T687 T700 T707 T737 T852 T857 T908 T913 PROTEIN DEVELOPMENTAL BLAST_(—) FORMIN LIMB DEFORMITY PRODOM NUCLEAR ALTERNATIVE SPLICING CELL DIAPHANOUS PD003542: 1790-R1036 FORMIN BLAST_(—) DM04565|Q05859|5-1205: I414-K940, DOMO P519-E1022 DM04565|Q05860|176-1467: I414-K987 DM04565|Q05858|1-1212: E391-E955, P510-E1022 REGULATORY BLAST_(—) DM05091|S54986|1-980: K351-A574, DOMO A492-A589, D532-S597, P580-E900, D327-E394, M914-I1050 17 7498437CD1 428 S98 S99 S121 S178 T23 ADP-ribosylation factor family: M5-D193 HMMER_(—) T37 T68 T334 T348 PFAM T385 ADP-ribosylation factors family proteins BLIMPS_(—) BL01019: P51-Y90, V95-L149, E155-K180 BLOCKS GTP-binding SAR1 protein signature BLIMPS_(—) PR00328: P51-G75, I78-R103, K123-I144, PRINTS T23-P46 RAS TRANSFORMING PROTEIN BLAST_(—) DM00006|A48259|13-176: R20-L187 DOMO DM00006|P36405|14-177: P18-L186 DM00006|Q06849|13-176: R20-G170 DM00006|S62420|13-176: R20-L186 ATP/GTP-binding site motif A (P-loop): MOTIFS G28-T35 18 3097848CD1 633 S157 S253 S371 S477 N90 N111 
Guanylate-binding protein, N-terminal HMMER_(—) S522 T49 T179 T195 domain: PFAM T280 T298 T348 T482 K6-L281 T552 T584 T585 Y446 Guanylate-binding protein, C-terminal HMMER_(—) domain: PFAM E283-K579 PROTEIN BINDING INTERFERON- BLAST_(—) INDUCED GUANYLATE-BINDING PRODOM GUANINE NUCLEOTIDE MULTIGENE PD010106: M1-K433 MACROPHAGE ACTIVATION 2 BLAST_(—) GUANYLATE-BINDING PROTEIN PRODOM PD184314: L434-T482 GTP NP_BIND: BLAST_(—) DM04725|P32456|1-590: P10-I575 DOMO DM04725|P32455|1-591: M1-K579 DM04725|Q01514|1-588: M1-M581 ATP/GTP-binding site motif A (P-loop): MOTIFS G45-S52 19 2957789CD1 1958 S10 S36 S38 S57 S79 N268 N290 N350 PDZ domain (Also known as DHR HMMER_(—) S221 S253 S270 S278 N431 N557 N1313 or GLGF): PFAM S286 S295 S310 S360 N1452 N1477 T50-K158 S414 S421 S441 S518 N1579 N1928 S552 S573 S595 S625 N1932 S639 S645 S671 S689 S703 S710 S717 S743 S792 S874 S881 S896 S902 S911 S914 S920 S924 S954 S983 S999 S1014 S1055 S1066 S1080 S1099 S1104 S1110 S1194 S1288 S1355 S1378 S1390 S1399 S1418 S1431 S1432 S1433 S1458 S1462 S1478 S1518 S1527 S1537 S1562 S1601 S1610 S1614 S1637 S1650 S1664 S1665 S1669 S1675 S1683 S1687 S1694 S1704 S1708 S1717 S1754 S1797 S1843 S1848 S1861 S1866 S1879 S1950 T4 T52 T63 T149 T391 PH (pleckstrin homology) domain: HMMER_(—) T407 T442 T525 T532 A932-N1040 PFAM T679 T765 T876 T945 T981 T1003 T1012 T1034 T1049 T1074 T1122 T1140 T1157 T1258 T1273 T1283 T1309 T1350 T1454 T1479 T1485 T1486 T1498 T1542 T1567 T1597 T1635 T1750 T1760 T1792 T1803 T1906 Y303 RhoGAP domain: P1162-T1315 HMMER_(—) PFAM Spectrin pleckstrin homology BLIMPS_(—) domain signature PRINTS PR00683: R956-R977, I996-T1013, D1015-K1033 GTPASE DOMAIN ACTIVATION BLIMPS_(—) PD00930: P1162-G1187, L1266-L1306 PRODOM GTPASE DOMAIN SH2 ACTIVATION BLAST_(—) ZINC 3 KINASE SH3 PHOSPHA- PRODOM TIDYLINOSITOL REGULATORY PD000780: I1161-D1312 PH DOMAIN BLAST_(—) DM00470|P34588|1-285: G1146-W1337, DOMO T669-K691 DM00470|P15882|109-331: A1155-D1336 DM00470|Q03070|63-292: A1155-D1336 
DM00470|P52757|241-463: A1155-D1336 ATP/GTP-binding site motif A (P-loop): MOTIFS G129-T136 20 5922849CD1 63 signal_cleavage: M1-A37 SPSCAN GGL domain: I8-L63 HMMER_(—) PFAM Small, acid-soluble spore proteins, PROFILE- alpha/beta type, signatures: SCAN M1-P52 G-protein gamma subunit BLIMPS_(—) BR50058: M5-P52 BLOCKS Gamma G-protein (transducin) BLIMPS_(—) signature PRINTS PR00321: L20-E34, D40-R57 GUANINE NUCLEOTIDE-BINDING BLAST_(—) PROTEIN SUBUNIT TRANSDUCER PRODOM PRENYLATION LIPOPROTEIN MULTIGENE FAMILY GI/GS/GO PD003783: A6-L63 GTP-BINDING REGULATORY PROTEIN BLAST_(—) GAMMA CHAIN: DOMO DM01133|A56181|1-70: M1-L62 DM01133|P16874|1-70: S2-L63 DM01133|I39157|1-75: M1-C60 DM01133|JC4340|1-75: N3-L62 21 7472828CD1 344 S68 S165 S239 S250 N225 Src homology domain: HMMER_(—) T78 T113 T258 Y181 E253-F318, L42-V52 PFAM SRC HOMOLOGY 2 (SH2) DOMAIN BLAST_(—) DM00048|A56110|416-529: E253-L339 DOMO 22 8088595CD1 224 S38 S89 T21 T160 N115 signal_cleavage: M1-R64 SPSCAN Adenylate and Guanylate cyclase HMMER_(—) catalytic site: PFAM G52-M80 ATP/GTP-binding site motif A (P-loop): MOTIFS G189-S196 23 7488478CD1 309 S252 T22 T216 T251 N37 N201 Ank repeat: V63-V95, K96-Y128, HMMER_(—) T267 S171-T203, N129-K161, R213-L246 PFAM

[0445] TABLE 4 Polynucleotide SEQ ID NO:/ Incyte ID/ Sequence Length Sequence Fragments 24/ 1-4065, 542-860, 671-901, 989-1394, 1080-1721, 1081-1355, 1226-1693, 1294-1824, 7461789- 1294-1921, 1362-1921, 1458-1807, 1569- 1870, 1922-2097, 3263-3864, 3263-3929, CB1/4184 3563-3813, 3563-4041, 3565-3867, 3586-4184, 3696-3908, 3696-3909, 3817-4066, 3817-4184, 3854-4184 25/ 1-691, 21-456, 23-589, 53-730, 55-698, 58-622, 91-746, 115-563, 163-422, 1210450- 174-421, 178-414, 213-440, 213-632, 230-753, 237- 491, 241-415, 246-564, CB1/1114 246-565, 253-460, 253-479, 253-702, 253-721, 253-751, 253-807, 253-822, 270-519, 270-530, 275-762, 278-927, 300-699, 305-563, 313-783, 314-506, 317-843, 320-899, 321-597, 323-967, 324-513, 326-578, 342-1005, 357-537, 359-603, 360-713, 370-609, 371-784, 377-662, 384-638, 386-640, 386-669, 398-686, 402-691, 422-629, 422-796, 430-643, 432-1030, 439-1045, 441-490, 441-692, 450-497, 452-699, 452-703, 457-699, 458-666, 472-726, 483-769, 485-717, 489-750, 491-707, 491- 709, 497-719, 497-726, 503-713, 506-1036, 515-779, 524-781, 533-776, 543-956, 544-743, 545-1049, 546-779, 550-657, 551- 796, 551-820, 559-1049, 573-727, 578-782, 591-1049, 598-944, 602-860, 612-900, 614-772, 624-1049, 637-965, 652-899, 653-1049, 654-886, 654-1037, 658-1028, 670-935, 692-933, 695-851, 724-993, 729-947, 741-999, 756-1044, 759-1038, 759-1066, 759-1094, 781-858, 813-1114, 856-1076, 892-1048 26/ 1-240, 27-259, 51-302, 51-309, 51-557, 56-526, 69-321, 130-395, 173-530, 427539- 198-858, 337-574, 376-527, 376-741, 376-758, 376- 769, 376-800, 376-805, CB1/1625 376-838, 376-899, 391-526, 412-656, 461-723, 461-933, 468-825, 469-1065, 489-773, 544-849, 616-912, 619-925, 636-836, 636-886, 636-1216, 657-951, 660-901, 709-916, 757-1011, 787-1037, 787-1307, 787-1374, 845-1392, 916-1222, 918-1190, 934-1607, 990-1597, 1004-1602, 1019-1625, 1040-1278, 1050-1613, 1072-1323, 1083-1599, 1084-1333, 1084- 1336, 1084-1608, 1084-1613, 1110-1331, 1128-1403, 1134-1612, 1139-1612, 1169-1413, 
1169-1625, 1216-1386, 1216-1625, 1252-1503, 1299-1527, 1299-1625, 1337-1607, 1428-1613 27/ 1-323, 1-473, 24-347, 24-474, 24-500, 40-529, 41-505, 107-445, 271-637, 1545043- 280-637, 410-561, 535-636, 538-968, 557-892, 596-858, 596-1082, 596-1353, CB1/4713 650-1323, 680-1104, 680-1333, 799-1044, 816-3955, 954-1651, 1038-1561, 1039-1325, 1161-1661, 1241-1922, 1264-1925, 1266-1653, 1286-1560, 1286-1697, 1286-1796, 1328-2022, 1409-1810, 1559-2218, 1643-1850, 1683-1889, 1683-2305, 1703-2220, 1729-2220, 1751-2382, 1752-2019, 1764-2181, 1798-2461, 1877-2565, 1901-2568, 1916-2542, 1929-2475, 1960-2372, 1962-2450, 1962-2627, 1965-2198, 1965-2388, 1965-2403, 1965-2430, 1965-2468, 1966-2597, 1975-2519, 1990-2622, 1991-2522, 2037-2653, 2039-2515, 2044-2704, 2048-2722, 2064-2724, 2070-2700, 2082-2330, 2185-2842, 2193-2829, 2202-2871, 2245-2945, 2315-2557, 2328-2625, 2340-2980, 2395-3093, 2416-3127, 2436-2935, 2493-3010, 2504-3130, 2509-3169, 2511-3169, 2519-2793, 2558-3105, 2577-3272, 2578-2826, 2578-2827, 2588-3211, 2594-3227, 2594-3270, 2653-3260, 2656-3284, 2659-3285, 2666-3266, 2694-3031, 2694-3095, 2694-3257, 2694-3273, 2713-3243, 2713-3332, 2718-3008, 2728-2979, 2728-3409, 2736-3335, 2745-3262, 2761-2999, 2774-3391, 2786-3017, 2791-3490, 2792-3448, 2805-3447, 2819-3484, 2837-3471, 2842-3055, 2856-3532, 2865-3385, 2900-3294, 2918-3449, 2927-3167, 2927-3338, 2941-3345, 2954-3486, 2956-3591, 2970-3451, 2984-3582, 2997-3546, 3011-3179, 3021-3511, 3031-3187, 3031-3643, 3034-3382, 3040-3369, 3043-3226, 3068-3677, 3072-3510, 3092-3569, 3096-3381, 3099-3373, 3099-3483, 3123-3674, 3128-3564, 3213-3740, 3233-3507, 3246-3663, 3247-3466, 3254-3760, 3311-3574, 3327-3898, 3370-3645, 3382-3879, 3390-4000, 3398-4036, 3413-3923, 3416-3998, 3435-3772, 3439-3636, 3440-3676, 3447-4070, 3455-4171, 3494-4019, 3499-3776, 3503-3771, 3521-3977, 3544-4025, 3567-3827, 3571-4227, 3580-4253, 3630-4257, 3639-4436, 3647-3863, 3648-4270, 3651-4025, 3659-4173, 3660-4161, 3665-3856, 3673-3924, 3678-3774, 
3679-3923, 3679-3933, 3679-4238, 3681-3913, 3681-4289, 3690-4241, 3691-4208, 3699-3979, 3700-3944, 3701-3958, 3701-3969, 3701-4273, 3766-4011, 3813-4269, 3877-4466, 3883-4450, 3895-4217, 3902-4591, 3926-4479, 3976-4228, 3996-4247, 4012-4632, 4064-4642, 4084-4693, 4086-4490, 4086-4588, 4094-4488, 4094-4490, 4120-4699, 4144-4395, 4172-4642, 4177-4488, 4177-4642, 4188-4641, 4190-4641, 4209-4639, 4209-4642, 4218-4640, 4234-4502, 4244-4579, 4260-4706, 4299-4640, 4302-4713, 4307-4638, 4314-4713, 4340-4640, 4348-4713, 4364-4642, 4434-4642 28/ 1-330, 1-728, 4-652, 6-330, 12-330, 32-330, 39-330, 42-330, 44-330, 7488231- 46-330, 61-330, 70-330, 71-330, 72-330, 101-330, 103-330, 107-330, 109-330, CB1/2571 110-330, 111-330, 112-330, 112-754, 119-330, 124-330, 127-330, 138-330, 148-330, 235-330, 247-326, 247-330, 276-330, 277-330, 278-328, 278-330, 281-330, 285-330, 287-325, 289-330, 291-330, 292-330, 323-375, 323-484, 323-488, 323-489, 323-493, 323-505, 323-515, 323-518, 323-521, 323-529, 323-533, 323-536, 323-541, 323-542, 323-543, 323-550, 323-555, 323-568, 323-576, 323-578, 323-579, 323-585, 323-586, 323-592, 323-593, 323-594, 323-596, 323-606, 323-610, 323-612, 323-613, 323-617, 323-618, 323-621, 323-623, 323-624, 323-625, 323-626, 323-628, 323-629, 323-633, 323-635, 323-636, 323-638, 323-640, 323-641, 323-642, 323-644, 323-650, 323-651, 323-658, 323-661, 323-675, 323-679, 323-681, 323-685, 323-704, 323-712, 323-722, 323-725, 323-728, 323-729, 323-730, 323-753, 323-758, 323-763, 323-785, 323-791, 323-794, 323-796, 323-806, 323-833, 323-837, 323-844, 323-846, 323-847, 323-854, 323-865, 323-900, 323-903, 323-904, 323-920, 323-987, 323-988, 323-1001, 323-1037, 323-1039, 323-1040, 323-1044, 323-1047, 323-1053, 323-1107, 323-1142, 326-934, 330-541, 332-1226, 334-571, 336-939, 337-873, 348-1159, 349-1071, 362-681, 363-580, 363-669, 363-704, 363-706, 363-1095, 366-933, 372-985, 374-968, 377-647, 380-1108, 393-678, 397-1096, 398-1064, 416-759, 417-580, 432-1195, 440-671, 440-696, 
441-668, 447-877, 453-886, 455-696, 457-1181, 465-713, 465-846, 465-1250, 738-1195, 962-1562, 1188-1799, 1331-1375, 1331-1379, 1331-1380, 1331-1384, 1331-1397, 1331-1398, 1331-1406, 1331-1409, 1331-1412, 1331-1420, 1331-1424, 1331-1427, 1331-1432, 1331-1433, 1331-1434, 1331-1441, 1331-1446, 1331-1459, 1331-1462, 1331-1467, 1331-1469, 1331-1470, 1331-1471, 1331-1476, 1331-1477, 1331-1483, 1331-1484, 1331-1485, 1331-1487, 1331-1497, 1331-1501, 1331-1503, 1331-1504, 1331-1508, 1331-1509, 1331-1512, 1331-1514, 1331-1515, 1331-1516, 1331-1517, 1331-1519, 1331-1520, 1331-1524, 1331-1526, 1331-1527, 1331-1529, 1331-1531, 1331-1532, 1331-1533, 1331-1535, 1331-1538, 1331-1541, 1331-1542, 1331-1543, 1331-1549, 1331-1552, 1331-1560, 1331-1565, 1331-1566, 1331-1569, 1331-1570, 1331-1572, 1331-1576, 1331-1581, 1331-1583, 1331-1585, 1331-1587, 1332-1559, 1332-1583, 1334-1587, 1338-1587, 1344-1587, 1346-1587, 1348-1587, 1356-1587, 1582-1703, 1582-1707, 1582-1708, 1582-1712, 1582-1725, 1582-1726, 1582-1734, 1582-1737, 1582-1740, 1582-1748, 1582-1752, 1582-1755, 1582-1760, 1582-1761, 1582-1762, 1582-1769, 1582-1774, 1582-1787, 1582-1790, 1582-1795, 1582-1797, 1582-1798, 1582-1804, 1582-1805, 1582-1811, 1582-1812, 1582-1813, 1582-1815, 1582-1825, 1582-1829, 1582-1831, 1582-1832, 1582-1836, 1582-1837, 1582-1840, 1582-1842, 1582-1843, 1582-1844, 1582-1845, 1582-1847, 1582-1848, 1582-1852, 1582-1854, 1582-1855, 1582-1857, 1582-1859, 1582-1860, 1582-1861, 1582-1863, 1582-1869, 1582-1870, 1582-1871, 1582-1877, 1582-1880, 1582-1888, 1582-1893, 1582-1894, 1582-1898, 1582-1900, 1582-1904, 1582-1923, 1582-1925, 1582-1931, 1582-1941, 1582-1944, 1582-1947, 1582-1948, 1582-1949, 1582-1971, 1582-1972, 1582-1974, 1582-2002, 1582-2008, 1582-2011, 1582-2022, 1582-2048, 1582-2052, 1582-2059, 1582-2061, 1582-2062, 1582-2079, 1582-2084, 1582-2115, 1582-2118, 1582-2119, 1582-2135, 1582-2149, 1582-2152, 1582-2202, 1582-2203, 1582-2216, 1582-2227, 1582-2238, 1582-2246, 1582-2264, 1582-2310, 1582-2571, 
1585-2148, 1588-1760, 1591-2200, 1593-2183, 1596-1866, 1599-2244, 1612-1900, 1612-1995, 1616-2310, 1617-2246, 1635-1976, 1636-1799, 1651-2214, 1659-1890, 1659-1915, 1660-1887, 1666-2092, 1672-2101, 1674-1915, 1676-2310 1684-1932 1684-2061, 1684-2313, 1957-2214, 2177-2310 29/ 1-630, 1-663, 6-436, 45-143, 45-655, 59-195, 232-882, 276-672, 309-826, 1910008- 309-842, 327-717, 330-1032, 476-984, 478-1085, 549-1163, 551-1163, 747-1324, CB1/4953 757-996, 795-1043, 805-1057, 805-1302, 805-1346, 805-1486, 805-1533, 826-1427, 920-1423, 978-1574, 1008-1675, 1035-1658, 1064-1625, 1073-1554, 1073-1584, 1073-1765, 1102-1617, 1110-1560, 1151-1686, 1172-1738, 1250-1736, 1275-1930, 1296-1959, 1318-1862, 1326-1951, 1366-1866, 1385-1829, 1412-1642, 1412-1872, 1431-1949, 1470-2047, 1494-1913, 1542-2117, 1576-1830, 1581-2147, 1607-2159, 1609-2191, 1633-1900, 1636-2163, 1646-2245, 1665-2181, 1712-2235, 1749-2090, 1750-2006, 1760-1960, 1789-2428, 1792-2406, 1836-2402, 1837-2427, 1872-2404, 1873-2428, 1935-2235, 1953-2404, 1992-2252, 2120-2371, 2146-2429, 2240-2443, 2300-2728, 2440-2568, 2440-2717, 2440-2783, 2440-2790, 2440-2797, 2440-2801, 2440-2803, 2440-2804, 2440-2815, 2446-2814, 2455-2813, 2483-2809, 2483-2814, 2483-2815, 2485-2818, 2491-2799, 2495-2799, 2495-2814, 2495-2816, 2500-2796, 2504-2796, 2504-2812, 2512-2779, 2512-2816, 2516-2797, 2516-2809, 2521-2816, 2522-2752, 2522-2753, 2523-2785, 2524-2768, 2524-2814, 2526-2809, 2526-2812, 2529-2806, 2534-2816, 2536-2796, 2537-2812, 2551-3154, 2557-2813, 2557-2818, 2558-2814, 2560-2698, 2560-2818, 2561-2803, 2561-2812, 2570-3121, 2572-2803, 2574-2815, 2588-2815, 2591-2797, 2641-2818, 2654-2809, 2701-2952, 2704-3195, 2705-3271, 2752-3267, 2918-3209, 2944-3267, 2990-3258, 3014-3310, 3075-3374, 3076-3288, 3121-3417, 3177-3457, 3318-3555, 3352-3547, 3358-3619, 3391-3705, 3437-3702, 3438-4076, 3439-3664, 3439-3731, 3442-3685, 3452-3742, 3454-4130, 3457-3889, 3467-3636, 3468-3725, 3468-3985, 3474-3717, 3476-3715, 3481-3616, 3503-4091, 
3540-3814, 3624-3884, 3678-3908, 3687-3878, 3705-3930, 3705-4176, 3749-4232, 3773-4023, 3855-4103, 3865-4121, 3885-4135, 3900-4163, 3902-4113, 3946-4210, 3972-4207, 4002-4238, 4015-4238, 4015-4409, 4052-4432, 4057-4306, 4059-4533, 4064-4323, 4096-4290, 4133-4382, 4133-4720, 4152-4420, 4155-4421, 4164-4413, 4174-4436, 4174-4481, 4174-4498, 4178-4672, 4192-4449, 4195-4423, 4195-4424, 4201-4446, 4208-4477, 4226-4922, 4250-4592, 4257-4918, 4267-4493, 4267-4500, 4275-4530, 4280-4494, 4280-4495, 4280-4541, 4320-4601, 4343-4612, 4356-4617, 4388-4591, 4390-4924, 4406-4597, 4414-4645, 4442-4691, 4514-4908, 4539-4745, 4603-4808, 4612-4813, 4613-4855, 4718-4953 30/ 1-221, 152-1176, 383-687, 421-979, 421-986, 1119-1379, 1119-1411, 1119-1429, 5151459- 1119-1475, 1119-1476, 1119-1500, 1119-1535, 1119-1566, 1119-1584, 1119-1595, CB1/2259 1119-1624, 1119-1626, 1119-1630, 1119-1648, 1119-1660, 1119-1664, 1119-1666, 1119-1670, 1119-1671, 1119-1672, 1119-1674, 1119-1684, 1119-1696, 1124-1744, 1184-1715, 1222-1757, 1224-1741, 1357-1899, 1379-1658, 1381-1950, 1383-1937, 1392-1858, 1394-1538, 1402-1665, 1461-1702, 1471-2078, 1479-1740, 1479-2078, 1492-1746, 1502-1775, 1517-1932, 1566-2153, 1566-2177, 1574-2184, 1590-2141, 1611-1875, 1653-2231, 1722-2259, 1737-1932, 1763-2078, 1777-2257, 1782-2259, 1793-2249, 1800-2249, 1804-2249, 1811-2093, 1811-2259, 1841-2071, 1855-2259, 1885-2257, 1892-2249, 1896-2251 31/ 1-82, 1-789, 1-2974, 4-789, 48-82, 48-789, 61-789, 62-789, 64-789, 55140256- 100-789, 103-789, 114-789, 120-789, 138-218, 145-218, 184-218, 184-789, CB1/2974 197-789, 198-789, 199-789, 215-789, 284-789, 297-789, 904-1401, 1074-1496, 1320-2560, 1493-2140, 1493-2145, 1947-2218, 1950-2218, 2104-2558, 2176-2558, 2220-2774, 2227-2560 32/ 1-505, 23-268, 252-1028, 510-1327, 520-940, 550-1271, 832-1088, 832-1334, 2744344- 832-1364, 832-1625, 933-4676, 983-1153, 1160-1740, 1277-1567, 1300-1933, CB1/5121 1340-1954, 1346-1684, 1353-1957, 1361-1763, 1362-1621, 1432-1725, 1450-2004, 1473-1964, 
1590-2297, 1649-2302, 1659-2230, 1669-2212, 1669-2238, 1669-2291, 1688-2114, 1721-2107, 1729-2192, 1733-2149, 1733-2200, 1733-2203, 1733-2223, 1733-2226, 1733-2264, 1751-2208, 1752-2305, 1753-2295, 1757-2306, 1764-2306, 1765-2181, 1780-2304, 1788-2222, 1816-2291, 1822-2295, 1824-2306, 1830-2035, 1839-2306, 1847-2385, 1848-2451, 1848-2499, 1848-2518, 1853-2305, 1853-2306, 1858-2306, 1894-2306, 1942-2292, 2044-2306, 2079-2368, 2079-2669, 2123-2792, 2127-2755, 2139-2305, 2200-2263, 2240-2960, 2510-3239, 2721-2984, 2908-3172, 2908-3176, 2908-3514, 2926-3489, 2969-3481, 2989-3560, 2989-3635, 2994-3140, 2995-3633, 2996-3140, 3041-3776, 3045-3269, 3061-3154, 3161-3390, 3163-3595, 3204-3733, 3205-3733, 3230-3829, 3239-3426, 3266-3385, 3317-3848, 3339-3738, 3339-3986, 3348-3836, 3410-4112, 3418-3930, 3422-3688, 3426-4076, 3443-3717, 3444-3994, 3488-3782, 3490-3933, 3494-3945, 3510-3722, 3510-3786, 3511-3819, 3518-3766, 3520-3715, 3520-3830, 3520-3974, 3539-4091, 3539-4224, 3549-4101, 3553-3865, 3560-4157, 3564-3815, 3584-3836, 3598-4289, 3601-3985, 3602-4113, 3616-4160, 3620-4051, 3638-4160, 3676-4265, 3680-4078, 3683-4077, 3696-4048, 3738-4167, 3751-4271, 3762-4310, 3767-4288, 3775-4341, 3794-4228, 3832-4354, 3860-4422, 3862-4474, 3873-4256, 3875-4369, 3901-4216, 3908-4460, 3921-4208, 3923-4332, 3926-4176, 3944-4218, 3980-4253, 3984-4298, 3992-4545, 4005-4603, 4006-4310, 4010-4650, 4014-4331, 4014-4535, 4025-4574, 4031-4571, 4044-4351, 4046-4607, 4056-4655, 4069-4713, 4071-4292, 4161-4458, 4161-4706, 4170-4460, 4180-4469, 4189-4564, 4189-4604, 4190-4783, 4197-4331, 4198-4439, 4198-4546, 4198-4556, 4224-4846, 4234-4846, 4234-4860, 4242-4824, 4259-4812, 4261-4787, 4265-4878, 4274-4774, 4278-4780, 4284-4535, 4286-4537, 4291-4878, 4315-4819, 4319-4621, 4326-4521, 4335-4843, 4341-4565, 4347-4935, 4362-5022, 4365-4787, 4374-4630, 4399-4687, 4429-4676, 4437-4709, 4444-4767, 4456-4609, 4456-5119, 4465-4687, 4493-4695, 4527-4931, 4556-4839, 4556-4846, 4556-5018, 4584-4869, 
4588-5121, 4594-4978, 4613-4880, 4615-4863, 4621-4812, 4621-5000, 4621-5121, 4628-5034, 4635-4862 33/ 1-240, 1-417, 1-418, 1-460, 1-461, 1-465, 1-548, 1-553, 1-566, 1555147- 1-581, 1-700, 1-706, 1-1840, 1-3329, 4-414, 4-444, 4-610, 4-829, CB1/3625 7-391, 7-577, 33-425, 33-449, 33-450, 33-510, 2661-3289, 2661-3326, 2813-3097, 2901-3610, 3218-3625 34/ 1-36, 1-79, 1-105, 1-119, 1-126, 1-127, 1-132, 1-144, 1-145, 1-146, 1939136- 1-148, 1-150, 1-283, 3-148, 4-148, 36-148, 48-148, 55-140, 66-148, CB1/3988 69-148, 75-148, 84-148, 103-148, 119-148, 164-666, 461-1094, 473-735, 478-518, 478-520, 478-523, 478-538, 478-544, 478-548, 478-559, 478-609, 478-644, 478-666, 478-732, 478-748, 478-760, 478-776, 478-813, 478-826, 478-852, 478-881, 478-912, 478-937, 478-938, 478-992, 478-1077, 478-1078, 478-1094, 484-513, 484-887, 486-1094, 495-830, 508-1094, 514-1094, 516-933, 516-1029, 545-1094, 555-1094, 560-1094, 560-1095, 575-858, 575-1052, 586-1094, 589-1092, 591-996, 617-723, 618-1090, 642-664, 643-664, 675-924, 675-949, 675-968, 675-1094, 679-725, 683-1094, 688-1079, 690-1079, 696-1079, 698-1094, 700-1079, 700-1085, 702-1094, 703-971, 703-1094, 704-1079, 705-1084, 706-1079, 717-1094, 726-980, 733-1012, 738-1052, 741-1094, 746-839, 749-1094, 753-1049, 756-1094, 764-1094, 767-1094, 779-1094, 791-1094, 792-1094, 816-1094, 818-1094, 830-1094, 839-1094, 845-1094, 865-1094, 876-1094, 884-1093, 885-1094, 887-1094, 889-1094, 899-1094, 908-1094, 921-1094, 924-1094, 934-1094, 956-1091, 957-1092, 974-1094, 977-1094, 980-1095, 982-1094, 985-1094, 988-1094, 997-1094, 1007-1094, 1026-1715, 1032-1094, 1051-1094, 1056-1094, 1064-1094, 1097-1491, 1097-2306, 1105-1355, 1112-1740, 1136-1652, 1141-1421, 1224-1426, 1246-1537, 1246-1638, 1247-1491, 1247-1678, 1249-1539, 1249-1691, 1253-1659, 1286-1917, 1287-1503, 1306-1811, 1316-1544, 1339-1846, 1358-1720, 1368-1612, 1383-2079, 1384-1660, 1435-1569, 1435-1665, 1435-1679, 1435-1746, 1435-1952, 1435-1968, 1435-2073, 1435-2104, 1445-1722, 1449-2184, 
1465-1725, 1477-2184, 1484-1761, 1486-1940, 1486-2018, 1522-2005, 1534-1681, 1536-2129, 1537-2130, 1569-2270, 1581-2270, 1588-2270, 1599-2269, 1602-2334, 1608-2270, 1617-2044, 1636-1750, 1652-1924, 1677-2317, 1680-1870, 1680-1927, 1680-2622, 1681-1870, 1688-2390, 1697-2022, 1710-2399, 1716-2386, 1718-2125, 1718-2221, 1747-2390, 1757-2295, 1762-2443, 1762-2508, 1777-2050, 1777-2244, 1788-2350, 1791-2284, 1793-2188, 1819-1915, 1820-2282, 1844-2141, 1845-2352, 1859-2160, 1864-2116, 1875-2286, 1880-2271, 1882-2271, 1888-2271, 1890-2343, 1892-2271, 1892-2277, 1894-2286, 1895-2163, 1895-2286, 1896-2271, 1897-2276, 1898-2271, 1909-2286, 1918-2172, 1925-2204, 1930-2244, 1933-2286, 1938-2031, 1941-2311, 1945-2241, 1948-2372, 1956-2286, 1959-2441, 1971-2286, 1983-2286, 1984-2286, 2008-2286, 2010-2286, 2022-2303, 2031-2286, 2031-2371, 2037-2290, 2057-2286, 2068-2286, 2076-2285, 2077-2286, 2079-2620, 2081-2286, 2091-2286, 2100-2286, 2113-2286, 2116-2286, 2126-2286, 2148-2283, 2149-2284, 2166-3093, 2169-2286, 2172-2620, 2174-2286, 2177-2286, 2180-2400, 2189-2286, 2189-2800, 2199-2286, 2218-2286, 2224-2286, 2243-2771, 2248-2673, 2256-2754, 2369-2768, 2450-2724, 2537-2782, 2572-3051, 2606-2846, 2621-3020, 2645-3205, 2648-3035, 2791-3109, 2867-3075, 2919-3387, 2970-3566, 3052-3371, 3053-3563, 3060-3340, 3060-3343, 3064-3343, 3068-3293, 3081-3294, 3101-3401, 3179-3459, 3182-3414, 3264-3514, 3264-3648, 3273-3724, 3277-3788, 3364-3987, 3391-3942, 3396-3639, 3409-3652, 3431-3988, 3536-3786, 3571-3811, 3571-3988, 3597-3829, 3597-3963, 3597-3988, 3634-3872, 3642-3873, 3663-3967, 3665-3897, 3741-3767 35/ 1-353, 1-355, 1-369, 177-450, 236-3625, 363-894, 449-874, 449-914, 450-923, 5956978- 451-1181, 460-757, 477-943, 486-910, 489-1134, 494-1052, 561-1105, 589-740, CB1/4169 614-633, 665-1122, 674-923, 706-1350, 765-1174, 765-1425, 788-1276, 832-1468, 983-1807, 1074-1234, 1076-1477, 1169-1802, 1171-1781, 1171-1843, 1311-1894, 1438-1802, 1561-2227, 1602-2213, 1701-2232, 1733-2306, 1786-2577, 
1801-2577, 1921-2577, 2263-2570, 2275-2483, 2701-2979, 2753-3044, 2768-3324, 2791-2994, 2793-3246, 2884-3432, 3242-3625, 3511-3751, 3623-4119, 3623-4160, 3766-4167, 3834-3974, 3942-4168, 4027-4168, 4034-4155, 4052-4169, 4059-4168, 4071-4168, 4091-4167 36/ 1-2063, 613-830, 686-955, 717-1296, 717-2063, 784-1029, 784-1290, 786-1290, 7662817- 839-1290, 859-1430, 868-1276, 889-1148, 898-1654, 931-1542, 985-1247, 988-1462, CB1/2063 1038-1530 37/ 1-805, 101-172, 101-523, 101-664, 101-943, 101-1255, 1100-6382, 1746-1988 55139221- CB1/6382 38/ 1-206, 11-236, 71-773, 128-747, 233-340, 298-701, 308-1011, 430-991, 448-714, 7493736- 448-1012, 687-745, 687-960, 719-946, 724-967, 735-1369, 1016-1263 CB1/1369 39/ 1-353, 1-447, 1-466, 1-484, 1-491, 1-493, 1-549, 1-582, 1-584, 1-587, 4614878- 1-629, 1-633, 1-662, 1-666, 1-687, 1-703, 1-752, 1-765, 1-774, 1-792, CB1/3730 1-907, 2-730, 2-748, 2-764, 105-734, 113-708, 126-999, 180-786, 197-3730, 199-702, 200-765, 211-834, 256-950, 271-964, 283-478, 285-1039, 286-888, 292-766, 303-960, 369-943, 409-1309, 417-1122, 487-1310, 538-969, 563-1263, 591-872, 601-880, 618-687, 633-883, 640-1485, 649-1315, 658-1427, 695-1459, 699-1292, 715-1308, 736-1535, 758-1308, 801-1458, 831-1639, 839-1582, 847-1639, 848-1639, 850-1333, 852-1393, 852-1639, 858-1458, 858-1639, 869-1410, 871-1636, 881-1630, 881-1639, 882-1369, 891-1295, 897-1637, 917-1229, 926-1639, 928-1512, 930-1615, 932-1304, 933-1639, 935-1639, 941-1363, 942-1622, 949-1639, 950-1235, 962-1639, 967-1555, 967-1639, 970-1639, 973-1639, 980-1639, 986-1639, 993-1639, 997-1639, 1004-1639, 1006-1639, 1009-1639, 1012-1639, 1012-1650, 1016-1466, 1027-1639, 1029-1639, 1030-1639, 1031-1639, 1036-1553, 1040-1639, 1041-1639, 1045-1639, 1050-1639, 1062-1348, 1120-1639, 1122-1708, 1150-1639, 1172-1695, 1204-1639, 1206-1654, 1211-1607, 1213-1639, 1231-1625, 1249-1555, 1258-1645, 1264-1639, 1268-1639, 1281-1639, 1301-1595, 1399-1645, 1416-1951, 1508-1771, 1695-2269, 1701-1949, 1709-2305, 1710-1797, 
1711-1862, 1711-2369, 1713-2201, 1738-2312, 1741-2039, 1741-2221, 1759-2044, 1759-2162, 1759-2292, 1807-2392, 1835-2259, 1871-2279, 1977-2591, 1987-2491, 2022-2468, 2226-2382, 2226-2814, 2245-2841, 2273-2436, 2373-2955, 2420-2995, 2424-3062, 2440-2678, 2453-2997, 2474-2703, 2503-2770, 2526-2784, 2538-3171, 2556-3058, 2557-3116, 2558-3405, 2584-2956, 2605-2765, 2654-3269, 2782-3398, 2804-3274, 2805-3387, 2841-3133, 2856-3141, 2858-3106, 2868-3355, 2876-3506, 2936-3582, 2938-3549, 2960-3510, 3063-3687, 3072-3317, 3082-3469, 3115-3250, 3121-3652 40/ 1-138, 1-242, 1-254, 1-492, 1-577, 2-592, 2-602, 14-631, 128-389, 7498437- 232-769, 259-587, 302-1009, 317-926, 378-1048, 380-496, 383-1048, 410-571, CB1/2740 410-620, 410-631, 411-631, 465-631, 481-1072, 485-1047, 496-1030, 555-1360, 722-1011, 722-1064, 853-1086, 855-1088, 1015-1555, 1028-1651, 1096-1661, 1096-1733, 1129-1728, 1151-1462, 1204-1862, 1346-1573, 1349-1538, 1390-1459, 1735-2031, 1844-2138, 1844-2320, 1863-1953, 1866-2425, 1877-2101, 1952-2196, 1952-2254, 1954-2153, 2065-2708, 2154-2740, 2173-2404, 2173-2706, 2200-2320, 2200-2587, 2214-2446, 2228-2731, 2229-2620, 2327-2593, 2340-2621 41/ 1-670, 5-603, 7-559, 16-597, 17-255, 35-572, 383-845, 496-1075, 554-1049, 3097848- 837-987, 931-1413, 931-1589, 1004-1541, 1012-1200, 1066-1539, 1118-1589, CB1/2087 1186-1584, 1365-2017, 1383-2018, 1404-1796, 1614-1874, 1614-2052, 1854-2087 42/ 1-627, 3-634, 9-533, 96-585, 97-485, 133-622, 253-1059, 310-602, 344-1004, 2957789- 352-897, 368-845, 375-897, 388-805, 431-897, 464-7034, 613-897, 635-743, CB1/7034 764-897, 764-1361, 766-1188, 895-1479, 898-1048, 898-1070, 898-1078, 898-1085, 898-1095, 898-1099, 898-1100, 898-1101, 898-1108, 898-1111, 898-1116, 898-1117, 898-1126, 898-1128, 898-1130, 898-1135, 898-1138, 898-1140, 898-1141, 898-1143, 898-1145, 898-1151, 898-1152, 898-1153, 898-1156, 898-1159, 898-1162, 898-1165, 898-1171, 898-1177, 898-1178, 898-1182, 898-1183, 898-1189, 898-1190, 898-1195, 898-1197, 898-1198, 
898-1199, 898-1200, 898-1203, 898-1204, 898-1205, 898-1208, 898-1215, 898-1238, 898-1266, 898-1282, 898-1386, 898-1549, 898-1645, 898-1656, 904-1645, 905-1645, 909-1645, 925-1645, 945-1322, 945-1647, 955-1315, 987-1231, 1017-1271, 1066-1656, 1076-1310, 1076-1406, 1105-1515, 1108-1356, 1177-1724, 1179-1723, 1182-1787, 1279-1529, 1289-1505, 1412-1893, 1425-1893, 1523-1989, 1556-1813, 1888-2167, 1888-2434, 1912-2187, 2192-2766, 2241-2523, 2360-2909, 2390-2687, 2438-3010, 2438-3025, 2599-2825, 2696-3261, 2784-3331, 2823-3072, 2884-3402, 2999-3305, 3013-3081, 3135-3489, 3147-3385, 3147-3489, 3277-3833, 3363-3693, 3416-3681, 3434-3695, 3474-3634, 3671-4256, 3784-3932, 3948-4459, 4061-4313, 4106-4724, 4130-4718, 4220-4713, 4290-4454, 4377-4625, 4401-4866, 4401-4981, 4447-4842, 4462-4827, 4548-4819, 4612-5206, 4638-5073, 4673-4870, 4678-5019, 4678-5237, 4682-5272, 4711-5206, 4712-5372, 4733-5165, 4740-5136, 4741-5063, 4743-5025, 4753-5005, 4769-5003, 4792-5249, 4813-5348, 4816-5400, 4830-5212, 4832-5360, 4836-5401, 4842-5317, 4853-5092, 4859-5125, 4859-5414, 4859-5443, 4863-5206, 4866-5206, 4868-5206, 4869-5206, 4870-5206, 4871-5206, 4873-5206, 4879-5460, 4892-5159, 4895-5514, 4897-5439, 4899-5206, 4900-5206, 4901-5422, 4903-5206, 4907-5206, 4908-5047, 4914-5206, 4915-5206, 4915-5212, 4916-5206, 4919-5206, 4921-5206, 4922-5206, 4923-5206, 4924-5206, 4925-5206, 4926-5206, 4929-5206, 4930-5206, 4931-5206, 4932-5206, 4933-5523, 4934-5206, 4936-5514, 4936-5549, 4937-5206, 4938-5523, 4940-5206, 4940-5519, 4944-5206, 4947-5206, 4949-5273, 4957-5206, 4958-5597, 4976-5477, 4982-5331, 4984-5477, 5013-5273, 5019-5576, 5019-5584, 5031-5212, 5042-5320, 5044-5616, 5048-5481, 5056-5270, 5072-5371, 5075-5354, 5084-5666, 5091-5688, 5093-5686, 5115-5625, 5136-5686, 5138-5693, 5139-5619, 5148-5569, 5149-5739, 5155-5819, 5159-5738, 5163-5613, 5181-5794, 5189-5795, 5197-5774, 5204-5666, 5213-5428, 5213-5460, 5213-5473, 5213-5476, 5213-5480, 5213-5486, 5213-5488, 5213-5490, 5213-5492, 
5213-5493, 5213-5499, 5213-5501, 5213-5508, 5213-5511, 5213-5512, 5213-5513, 5213-5518, 5213-5549, 5213-5551, 5213-5552, 5213-5562, 5213-5898, 5221-5482, 5224-5391, 5253-5830, 5265-5683, 5265-5852, 5291-5817, 5292-5911, 5312-5529, 5312-5540, 5312-5673, 5312-5769, 5324-5819, 5357-5841, 5362-5814, 5369-5647, 5378-5760, 5379-5759, 5397-6038, 5400-5523, 5400-5785, 5412-5664, 5412-5915, 5441-6023, 5442-5896, 5445-6023, 5464-5624, 5465-5748, 5468-5730, 5480-5925, 5485-5724, 5485-6042, 5509-5754, 5538-6054, 5540-5979, 5564-6168, 5568-5827, 5570-5821, 5572-5930, 5575-5865, 5581-5719, 5581-5855, 5581-5858, 5581-6095, 5612-5865, 5623-6085, 5631-5901, 5647-6173, 5648-5962, 5652-5943, 5664-6300, 5696-6261, 5713-5980, 5713-6204, 5714-6301, 5732-6102, 5746-6068, 5746-6094, 5746-6125, 5746-6170, 5752-6041, 5752-6136, 5754-5981, 5754-6235, 5782-6026, 5823-6127, 5842-6117, 5843-6265, 5860-5996, 5860-6113, 5867-6146, 5870-6161, 5879-6150, 5893-6336, 5913-6211, 5921-6224, 5961-6206, 5964-6515, 6026-6245, 6026-6284, 6031-6318, 6032-6538, 6042-6328, 6043-6460, 6072-6563, 6101-6319, 6111-6474, 6120-6410, 6120-6716, 6136-6566, 6170-6403, 6170-6680, 6172-6433, 6172-6439, 6217-6445, 6217-6479, 6233-6440, 6245-6493, 6245-6510, 6245-6735, 6269-6566, 6287-6812, 6316-6604, 6316-6879, 6330-6622, 6370-6893, 6398-6675, 6398-6925, 6405-6893, 6420-6859, 6450-7034, 6455-6921, 6456-6751, 6480-6936 43/ 1-229, 84-305 5922849- CB1/305 44/ 1-90, 1-144, 1-270, 1-317, 1-337, 1-340, 1-342, 1-355, 1-392, 1-400, 7472828- 1-432, 1-483, 1-501, 1-535, 1-649, 1-682, 1-739, 1-740, 1-742, 1-743, CB1/1373 3-742, 9-1373, 13-742, 15-743, 39-742, 41-742, 53-742, 57-486, 62-486, 62-742, 67-742, 71-742, 89-742, 94-742, 95-574, 101-631, 106-739, 106-742, 116-635, 118-743, 119-742, 122-739, 131-742, 132-664, 134-742, 146-742, 154-742, 156-742, 159-742, 160-742, 162-736, 163-742, 174-742, 177-742, 178-742, 187-739, 190-742, 191-742, 192-739, 195-742, 211-742, 213-742, 218-742, 220-735, 221-742, 231-742, 233-739, 272-742, 
281-742, 287-742, 288-742, 295-742, 297-739, 302-742, 336-742, 337-742, 338-739, 338-742, 343-742, 346-739, 364-739, 369-742, 370-742, 371-742, 380-742, 384-739, 384-742, 394-742, 395-742, 402-742, 403-742, 415-742, 420-730, 427-742, 428-742, 434-742, 451-742, 454-739, 457-742, 462-742, 469-834, 484-742, 610-898, 610-1114, 610-1117, 610-1147, 610-1158, 610-1172, 610-1306, 610-1335, 618-739, 618-742, 709-1333 45/ 1-246, 1-496, 1-3108, 279-920, 782-1009, 877-1336, 877-1337, 877-1338, 8088595- 1243-1703, 1243-1753, 1243-1758, 1243-1765, 1243-1779, 1243-1784, 1243-1786, CB1/3108 1243-1787, 1243-1817, 1243-2050, 1246-1793, 1327-1507, 1510-1959, 1637-2613, 1640-2404, 1640-2632, 1752-1915, 1756-2197, 1993-2927, 2042-2927, 2057-2927, 2064-2927, 2071-2506, 2173-2927, 2174-2927, 2180-2927, 2523-3108 46/ 1-268, 1-973, 1-1197, 178-973, 364-822, 364-848, 715-962, 730-1265, 7488478- 748-1181, 788-1265, 801-1246, 909-1248, 972-1101, 1022-1220 CB1/1265
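As an illustration of how the fragment coordinates in Table 4 can be read, the following minimal sketch treats each "start-stop" entry as a 1-based inclusive interval, merges overlapping intervals, and checks whether they span the stated sequence length. The function names (`parse_fragments`, `merge_fragments`, `covers_full_length`) are hypothetical and are not part of the specification. The example uses the entry for SEQ ID NO:43 (Incyte ID 5922849CB1, length 305) as listed in Table 4.

```python
from typing import List, Tuple

Interval = Tuple[int, int]


def parse_fragments(text: str) -> List[Interval]:
    """Parse comma-separated 'start-stop' entries into (start, stop) tuples."""
    fragments = []
    for item in text.split(","):
        item = item.strip()
        if not item:
            continue
        start, stop = item.split("-")
        fragments.append((int(start), int(stop)))
    return fragments


def merge_fragments(fragments: List[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent 1-based inclusive intervals."""
    merged: List[Interval] = []
    for start, stop in sorted(fragments):
        if merged and start <= merged[-1][1] + 1:
            # Extend the previous interval when the new one overlaps or abuts it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], stop))
        else:
            merged.append((start, stop))
    return merged


def covers_full_length(fragments: List[Interval], length: int) -> bool:
    """Return True when the merged fragments span positions 1..length."""
    return merge_fragments(fragments) == [(1, length)]


# SEQ ID NO:43, Incyte ID 5922849CB1, sequence length 305 (from Table 4).
seq43 = parse_fragments("1-229, 84-305")
print(merge_fragments(seq43))          # [(1, 305)]
print(covers_full_length(seq43, 305))  # True
```

The same check could be applied to any other row of Table 4 by passing that row's fragment list and sequence length. Whether full coverage is expected for every row is an assumption of this sketch.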

[0446] TABLE 5

Polynucleotide SEQ ID NO: | Incyte Project ID: | Representative Library
24 | 7461789CB1 | SINTNON02
25 | 1210450CB1 | BRAUNOT02
26 | 427539CB1 | THYRNOT03
27 | 1545043CB1 | BRAINON01
28 | 7488231CB1 | PROTDNV09
29 | 1910008CB1 | COLRTUE01
30 | 5151459CB1 | BRONDIT01
31 | 55140256CB1 | BONMTUE02
32 | 2744344CB1 | UTRSNOT02
33 | 1555147CB1 | ADIPTXS05
34 | 1939136CB1 | BRAITDR03
35 | 5956978CB1 | BRAITDR03
36 | 7662817CB1 | LVENNOT02
38 | 7493736CB1 | PROSUNE04
39 | 4614878CB1 | BRAUNOR01
40 | 7498437CB1 | LUNGTUT17
41 | 3097848CB1 | ESOGTME01
42 | 2957789CB1 | BRAWTDA01
43 | 5922849CB1 | BRAIFET02
44 | 7472828CB1 | COLNUCT03
45 | 8088595CB1 | BLADTUN02
46 | 7488478CB1 | UTRSDIC01

[0447] TABLE 6

Library | Vector | Library Description

ADIPTXS05 | pINCY | This subtracted, pooled treated adipocyte tissue library was constructed using 2.48 million clones from a pooled treated adipocyte tissue library and was subjected to 2 rounds of subtraction hybridization with 1.33 million clones from an untreated pooled adipocyte tissue library. The starting library for subtraction was constructed using RNA isolated from pooled treated adipocytes removed from a 47-year-old female, a 38-year-old female, a 25-year-old female, a 37-year-old female, and a 35-year-old male during liposuction. The adipocytes were treated with 100 nM of human insulin. The hybridization probe for subtraction was derived from a similarly constructed untreated adipocyte tissue library using RNA isolated from a different donor pool. Subtractive hybridization conditions were based on the methodologies of Swaroop et al. (1991) Nucleic Acids Res. 19: 1954 and Bonaldo et al. (1996) Genome Res. 6: 791-806.

BLADTUN02 | pINCY | This normalized bladder tissue and bladder tumor tissue library was constructed from 4.56 million independent clones from a pooled bladder tissue and bladder tumor tissue library. Library was constructed using pooled cDNA from two donors. cDNA was generated using mRNA isolated from bladder tumor tissue and non-tumorous bladder tissue removed from a 58-year-old Caucasian male (donor A) during a radical cystectomy, radical prostatectomy, regional lymph node excision, and urinary diversion to bowel; and from bladder tumor tissue removed from a 72-year-old Caucasian male (donor B) during a radical cystectomy and prostatectomy. For donor A, pathology for the tumor tissue indicated invasive grade 3 transitional cell carcinoma forming an ulcerated mass in the left bladder wall. The patient presented with hematuria. Patient history included a benign colon neoplasm. Previous surgeries included appendectomy. The patient was not taking any medications.
Family history included benign hypertension, cerebrovascular disease, and atherosclerotic coronary artery disease in the mother. For donor B, pathology indicated an invasive grade 3 (of 3) transitional cell carcinoma, forming a mass extending into the wall of the right bladder base. Patient history included pure hypercholesterolemia and tobacco abuse. Previous surgeries included a cholecystectomy, a ligation and stripping of varicose veins, and a closed bladder biopsy. Patient medications included BCG vaccine (for treatment of bladder cancer). Family history included myocardial infarction and cerebrovascular disease in the father; brain cancer in the mother; and myocardial infarction in the sibling(s). The library was normalized in two rounds using conditions adapted from Soares et al. (1994) Proc. Natl. Acad. Sci. USA 91: 9228-9232 and Bonaldo et al. (1996) Genome Res. 6: 791-806, except that a significantly longer (48 hours/round) reannealing hybridization was used.

BONMTUE02 | PCDNA2.1 | This 5′ biased random primed library was constructed using RNA isolated from sacral bone tumor tissue removed from an 18-year-old Caucasian female during an exploratory laparotomy with soft tissue excision. Pathology indicated giant cell tumor of the sacrum. The patient presented with pelvic joint pain, constipation, urinary incontinence, and unspecified abdominal/pelvic symptoms. Patient history included a soft tissue malignant neoplasm. Patient medication included Darvocet. Family history included prostate cancer in the grandparent(s).

BRAIFET02 | pINCY | Library was constructed using RNA isolated from brain tissue removed from a Caucasian male fetus, who was stillborn with a hypoplastic left heart at 23 weeks of gestation.

BRAINON01 | PSPORT1 | Library was constructed and normalized from 4.88 million independent clones from a brain tissue library. RNA was made from brain tissue removed from a 26-year-old Caucasian male during cranioplasty and excision of a cerebral meningeal lesion.
Pathology for the associated tumor tissue indicated a grade 4 oligoastrocytoma in the right fronto-parietal part of the brain. The normalization and hybridization conditions were adapted from Soares et al. (1994) Proc. Natl. Acad. Sci. USA 91: 9228-9232, except that a significantly longer (48-hour) reannealing hybridization was used.

BRAITDR03 | PCDNA2.1 | This random primed library was constructed using RNA isolated from allocortex, cingulate posterior tissue removed from a 55-year-old Caucasian female who died from cholangiocarcinoma. Pathology indicated mild meningeal fibrosis predominately over the convexities, scattered axonal spheroids in the white matter of the cingulate cortex and the thalamus, and a few scattered neurofibrillary tangles in the entorhinal cortex and the periaqueductal gray region. Pathology for the associated tumor tissue indicated well-differentiated cholangiocarcinoma of the liver with residual or relapsed tumor. Patient history included cholangiocarcinoma, post-operative Budd-Chiari syndrome, biliary ascites, hydrothorax, dehydration, malnutrition, oliguria and acute renal failure. Previous surgeries included cholecystectomy and resection of 85% of the liver.

BRAUNOR01 | pINCY | This random primed library was constructed using RNA isolated from striatum, globus pallidus and posterior putamen tissue removed from an 81-year-old Caucasian female who died from a hemorrhage and ruptured thoracic aorta due to atherosclerosis. Pathology indicated moderate atherosclerosis involving the internal carotids, bilaterally; microscopic infarcts of the frontal cortex and hippocampus; and scattered diffuse amyloid plaques and neurofibrillary tangles, consistent with age. Grossly, the leptomeninges showed only mild thickening and hyalinization along the superior sagittal sinus. The remainder of the leptomeninges was thin and contained some congested blood vessels. Mild atrophy was found mostly in the frontal poles and lobes, and temporal lobes, bilaterally.
Microscopically, there were pairs of Alzheimer type II astrocytes within the deep layers of the neocortex. There was increased satellitosis around neurons in the deep gray matter in the middle frontal cortex. The amygdala contained rare diffuse plaques and neurofibrillary tangles. The posterior hippocampus contained a microscopic area of cystic cavitation with hemosiderin-laden macrophages surrounded by reactive gliosis. Patient history included sepsis, cholangitis, post-operative atelectasis, pneumonia, CAD, cardiomegaly due to left ventricular hypertrophy, splenomegaly, arteriolonephrosclerosis, nodular colloidal goiter, emphysema, CHF, hypothyroidism, and peripheral vascular disease.

BRAUNOT02 | pINCY | Library was constructed using RNA isolated from globus pallidus/substantia innominata tissue removed from the brain of a 35-year-old Caucasian male. No neuropathology was found. Patient history included dilated cardiomyopathy, congestive heart failure, and an enlarged spleen and liver.

BRAWTDA01 | PSPORT1 | This amplified library was constructed using RNA isolated from dentate nucleus tissue removed from a 55-year-old Caucasian female who died from cholangiocarcinoma. Pathology indicated mild meningeal fibrosis predominately over the convexities, scattered axonal spheroids in the white matter of the cingulate cortex and the thalamus, and a few scattered neurofibrillary tangles in the entorhinal cortex and the periaqueductal gray region. Pathology for the associated tumor tissue indicated well-differentiated cholangiocarcinoma of the liver with residual or relapsed tumor. Patient history included cholangiocarcinoma, post-operative Budd-Chiari syndrome, biliary ascites, hydrothorax, dehydration, malnutrition, oliguria and acute renal failure. Previous surgeries included cholecystectomy and resection of 85% of the liver.
BRONDIT01 pINCY Library was constructed using RNA isolated from right lower lobe bronchial tissue removed from a pool of 3 asthmatic Caucasian male and female donors, 22- to 51-years-old during bronchial pinch biopsies. Patient history included atopy as determined by positive skin tests to common aero-allergens. COLNUCT03 pINCY Library was constructed using RNA isolated from diseased colon tissue obtained from a 69- year-old Caucasian male during a partial colon excision with ileostomy. Pathology indicated severely active idiopathic inflammatory bowel disease most consistent with chronic ulcerative colitis. Patient history included benign neoplasm of the colon. Previous surgeries included cholecystectomy, spinal canal exploration, partial glossectomy, radical cystectomy, and bladder operation. Family history included cerebrovascular disease and benign hypertension. COLRTUE01 PSPORT1 This 5′ biased random primed library was constructed using RNA isolated from rectum tumor tissue removed from a 50-year-old Caucasian male during closed biopsy of rectum and resection of rectum. Pathology indicated grade 3 colonic adenocarcinoma which invades through the muscularis propria to involve pericolonic fat. Tubular adenoma with low grade dysplasia was also identified. The patient presented with malignant rectal neoplasm, blood in stool, and constipation. Patient history included benign neoplasm of the large bowel, hyperlipidemia, benign hypertension, alcohol abuse, and tobacco abuse. Previous surgeries included above knee amputation and vasectomy. Patient medications included allopurinol, Zantac, Darvocet, Centrum vitamins, and an unspecified stool softener. Family history included congestive heart failure in the mother; and benign neoplasm of the large bowel and polypectomy in the sibling(s). 
ESOGTME01 PSPORT This 5′ biased random primed library was constructed using RNA isolated from esophageal tissue removed from a 53-year-old Caucasian male during a partial esophagectomy, proximal gastrectomy, and regional lymph node biopsy. Pathology indicated no significant abnormality in the non-neoplastic esophagus. Pathology for the matched tumor tissue indicated invasive grade 4 (of 4) adenocarcinoma, forming a sessile mass situated in the lower esophagus, 2 cm from the gastroesophageal junction and 7 cm from the proximal margin. The tumor invaded through the muscularis propria into the adventitial soft tissue. Metastatic carcinoma was identified in 2 of 5 paragastric lymph nodes with perinodal extension. The patient presented with dysphagia. Patient history included membranous nephritis, hyperlipidemia, benign hypertension, and anxiety state. Previous surgeries included an adenotonsillectomy, appendectomy, and inguinal hernia repair. The patient was not taking any medications. Family history included atherosclerotic coronary artery disease, alcoholic cirrhosis, alcohol abuse, and an abdominal aortic aneurysm rupture in the father; breast cancer in the mother; a myocardial infarction and atherosclerotic coronary artery disease in the sibling(s); and myocardial infarction and atherosclerotic coronary artery disease in the grandparent(s). LUNGTUT17 pINCY Library was constructed using RNA isolated from lung tumor tissue removed from a 53-year-old male. Pathology indicated grade 4 adenocarcinoma. LVENNOT02 PSPORT1 Library was constructed using RNA isolated from the left ventricle of a 39-year-old Caucasian male, who died from a gunshot wound. PROSUNE04 pINCY This 5′ biased random primed library was constructed using RNA isolated from an untreated LNCaP cell line, derived from prostate carcinoma with metastasis to the left supraclavicular lymph nodes, removed from a 50-year-old Caucasian male (Schering). 
PROTDNV09 PCR2-TOPOTA Library was constructed using pooled cDNA from 106 different donors. cDNA was generated using mRNA isolated from lung tissue removed from male Caucasian fetus (donor A) who died from fetal demise; from brain and small intestine tissue removed from a 23-week-old Caucasian male fetus (donor B) who died from premature birth; from brain tissue removed from a Caucasian male fetus (donor C) who was stillborn with a hypoplastic left heart at 23 weeks' gestation; from liver tumor tissue removed from a 72-year-old Caucasian male (donor D) during partial hepatectomy; from left frontal/parietal brain tumor tissue removed from a 2- year-old Caucasian female (donor E) during excision of cerebral meningeal lesion; from pleural tumor tissue removed from a 55-year-old Caucasian female (donor F) during complete pneumonectomy; from liver tissue removed from a pool of thirty-two, 18 to 24-week-old male and female fetuses (donor G) who died from spontaneous abortions; from kidney tissue removed from a pool of fifty-nine 20 to 33-week-old male and female fetuses (donor H) who died from spontaneous abortions; and from thymus tissue removed from a pool of nine 18 to 32-year-old males and females (donor I) who died from sudden death. For donors A, B, and C, serologies were negative. For donor B, family history included diabetes in the mother. For donor D, pathology indicated metastatic grade 2 (of 4) neuroendocrine carcinoma of the right liver lobe. The patient presented with secondary malignant neoplasm of the liver. Patient history included benign hypertension, type I diabetes, hyperplasia of the prostate, malignant prostate neoplasm, and tobacco and alcohol abuse in remission. Previous surgeries included excision/destruction of a pancreas lesion (insulinoma), closed prostatic biopsy, transurethral prostatectomy, and excision of both testes. Patient medications included Eulexin, Hytrin, Proscar, Ecotrin, and insulin. 
Family history included acute myocardial infarction and atherosclerotic coronary artery disease in the mother, and atherosclerotic coronary artery disease and type II diabetes in the father. For donor E, pathology indicated primitive neuroectodermal tumor with advanced ganglionic differentiation. The lesion was only moderately cellular but was mitotically active with a high MTB-1 labelling index. Neuronal differentiation was widespread and advanced. Multnucleate and dysplastic-appearing forms were readily seen. The glial element was less prominent. Synaptophysin, GFAP, and S-100 were positive. The patient presented with malignant brain neoplasm and motor seizures. The patient was not taking any medications. Family history included benign hypertension in the grandparent(s). For donor F, pathology indicated grade 3 sarcoma most consistent with leiomyosarcoma, uterine primary, involving the parietal pleura. The patient presented with secondary malignant lung neoplasm and shortness of breath. Patient history included peptic ulcer disease, malignant uterine neoplasm, normal delivery, deficiency anemia, and tobacco abuse in remission. Previous surgeries included total abdominal hysterectomy, bilateral salpingo-oophorectomy, hemorrhoidectomy, endoscopic excision of lung lesion, and incidental appendectomy. Patient medications included Megace, Pepcid and tamoxifen. Family history included atherosclerotic coronary artery disease and type II diabetes in the father; multiple sclerosis in the mother; and malignant breast neoplasm in the grandparent(s). SINTNON02 pINCY This normalized small intestine tissue library was constructed from 1.84 million independent clones from a pooled small intestine tissue library. Starting RNA was made from pooled cDNA from six different donors. 
cDNA was generated using mRNA isolated from small intestine tissue removed from a Caucasian male fetus (donor A) who died from fetal demise; from small intestine tissue removed from a 8-year-old Black male (donor B) who died from anoxia; from small intestine tissue removed from a 13-year-old Caucasian male (donor C) who died from a gunshot wound to the head; from jejunum and duodenum tissue removed from the small intestine of a 16-year-old Caucasian male (donor D) who died from head trauma; from ileum tissue removed from an 8-year-old Caucasian female (donor E) who died from head trauma; and from small intestine tissue removed from a 15-year-old Caucasian female who died from a closed head injury. Serologies were negative for donors A, B, C and D. Donors E and F had serologies positive for cytomegalovirus (CMV). Donor B's medications included DDAVP, Versed, and labetalol. Donor C's previous surgeries included a hernia repair. The patient was not taking any medications. Family history included diabetes in the grandparent(s). Donor D's history included a kidney infection three years prior to death, marijuana use, and tobacco use. Donor E's history included migraine headaches and urinary tract infection. Previous surgeries included an adenotonsillectomy. Patient medications included Dilantin (phenytoin), Ancef (cephalosporin), and Zantac (ranitidine). Donor F's history included seasonal allergies and marijuana use. Patient medications included Dopamine and Neo-Synephrine. The library was normalized in two rounds using conditions adapted from Soares et al. (1994) Proc. Natl. Acad. Sci. USA 91: 9228-9232 and Bonaldo et al. (1996) Genome Res. 6: 791-806, except that a significantly longer (48 hours/round) reannealing hybridization was used. THYRNOT03 pINCY Library was constructed using RNA isolated from thyroid tissue removed from the left thyroid of a 28-year-old Caucasian female during a complete thyroidectomy. 
Pathology indicated a small nodule of adenomatous hyperplasia present in the left thyroid. Pathology for the associated tumor tissue indicated dominant follicular adenoma, forming a well-encapsulated mass in the left thyroid. UTRSDIC01 PSPORT1 This large size fractionated library was constructed using pooled cDNA from eight donors. cDNA was generated using mRNA isolated from endometrial tissue removed from a 32-year-old female (donor A); endometrial tissue removed from a 32-year-old Caucasian female (donor B) during abdominal hysterectomy, bilateral salpingo-oophorectomy, and cystocele repair; from diseased endometrium and myometrium tissue removed from a 38-year-old Caucasian female (donor C) during abdominal hysterectomy, bilateral salpingo-oophorectomy, and exploratory laparotomy; from endometrial tissue removed from a 41-year-old Caucasian female (donor D) during abdominal hysterectomy with removal of a solitary ovary; from endometrial tissue removed from a 43-year-old Caucasian female (donor E) during vaginal hysterectomy, dilation and curettage, cystocele repair, rectocele repair and cystostomy; and from endometrial tissue removed from a 48-year-old Caucasian female (donor F) during a vaginal hysterectomy, rectocele repair, and bilateral salpingo-oophorectomy. Pathology (A) indicated the endometrium was in secretory phase. Pathology (B) indicated the endometrium was in the proliferative phase. Pathology (C) indicated extensive adenomatous hyperplasia with squamous metaplasia and focal atypia, forming a polypoid mass within the endometrial cavity. The cervix showed chronic cervicitis and squamous metaplasia. Pathology (D, E) indicated the endometrium was secretory phase. Pathology (F) indicated the endometrium was weakly proliferative. UTRSNOT02 PSPORT1 Library was constructed using RNA isolated from uterine tissue removed from a 34-year-old Caucasian female during a vaginal hysterectomy. Patient history included mitral valve disorder. 
Family history included stomach cancer, congenital heart anomaly, irritable bowel syndrome, ulcerative colitis, colon cancer, cerebrovascular disease, type II diabetes, and depression.

[0448] TABLE 7

Columns: Program; Description; Reference; Parameter Threshold. A Parameter Threshold entry is shown only for programs that list one.

Program: ABI FACTURA
Description: A program that removes vector sequences and masks ambiguous bases in nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA.

Program: ABI/PARACEL FDF
Description: A Fast Data Finder useful in comparing and annotating amino acid or nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA; Paracel Inc., Pasadena, CA.
Parameter Threshold: Mismatch <50%

Program: ABI AutoAssembler
Description: A program that assembles nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA.

Program: BLAST
Description: A Basic Local Alignment Search Tool useful in sequence similarity search for amino acid and nucleic acid sequences. BLAST includes five functions: blastp, blastn, blastx, tblastn, and tblastx.
Reference: Altschul, S. F. et al. (1990) J. Mol. Biol. 215: 403-410; Altschul, S. F. et al. (1997) Nucleic Acids Res. 25: 3389-3402.
Parameter Threshold: ESTs: Probability value = 1.0E−8 or less. Full Length sequences: Probability value = 1.0E−10 or less.

Program: FASTA
Description: A Pearson and Lipman algorithm that searches for similarity between a query sequence and a group of sequences of the same type. FASTA comprises at least five functions: fasta, tfasta, fastx, tfastx, and ssearch.
Reference: Pearson, W. R. and D. J. Lipman (1988) Proc. Natl. Acad Sci. USA 85: 2444-2448; Pearson, W. R. (1990) Methods Enzymol. 183: 63-98; and Smith, T. F. and M. S. Waterman (1981) Adv. Appl. Math. 2: 482-489.
Parameter Threshold: ESTs: fasta E value = 1.06E−6. Assembled ESTs: fasta Identity = 95% or greater and Match length = 200 bases or greater; fastx E value = 1.0E−8 or less. Full Length sequences: fastx score = 100 or greater.

Program: BLIMPS
Description: A BLocks IMProved Searcher that matches a sequence against those in BLOCKS, PRINTS, DOMO, PRODOM, and PFAM databases to search for gene families, sequence homology, and structural fingerprint regions.
Reference: Henikoff, S. and J. G. Henikoff (1991) Nucleic Acids Res. 19: 6565-6572; Henikoff, J. G. and S. Henikoff (1996) Methods Enzymol. 266: 88-105; and Attwood, T. K. et al. (1997) J. Chem. Inf. Comput. Sci. 37: 417-424.
Parameter Threshold: Probability value = 1.0E−3 or less

Program: HMMER
Description: An algorithm for searching a query sequence against hidden Markov model (HMM)-based databases of protein family consensus sequences, such as PFAM, INCY, SMART and TIGRFAM.
Reference: Krogh, A. et al. (1994) J. Mol. Biol. 235: 1501-1531; Sonnhammer, E. L. L. et al. (1988) Nucleic Acids Res. 26: 320-322; Durbin, R. et al. (1998) Our World View, in a Nutshell, Cambridge Univ. Press, pp. 1-350.
Parameter Threshold: PFAM, INCY, SMART or TIGRFAM hits: Probability value = 1.0E−3 or less. Signal peptide hits: Score = 0 or greater.

Program: ProfileScan
Description: An algorithm that searches for structural and sequence motifs in protein sequences that match sequence patterns defined in Prosite.
Reference: Gribskov, M. et al. (1988) CABIOS 4: 61-66; Gribskov, M. et al. (1989) Methods Enzymol. 183: 146-159; Bairoch, A. et al. (1997) Nucleic Acids Res. 25: 217-221.
Parameter Threshold: Normalized quality score ≧ GCG-specified “HIGH” value for that particular Prosite motif. Generally, score = 1.4-2.1.

Program: Phred
Description: A base-calling algorithm that examines automated sequencer traces with high sensitivity and probability.
Reference: Ewing, B. et al. (1998) Genome Res. 8: 175-185; Ewing, B. and P. Green (1998) Genome Res. 8: 186-194.

Program: Phrap
Description: A Phil's Revised Assembly Program including SWAT and CrossMatch, programs based on efficient implementation of the Smith-Waterman algorithm, useful in searching sequence homology and assembling DNA sequences.
Reference: Smith, T. F. and M. S. Waterman (1981) Adv. Appl. Math. 2: 482-489; Smith, T. F. and M. S. Waterman (1981) J. Mol. Biol. 147: 195-197; and Green, P., University of Washington, Seattle, WA.
Parameter Threshold: Score = 120 or greater; Match length = 56 or greater

Program: Consed
Description: A graphical tool for viewing and editing Phrap assemblies.
Reference: Gordon, D. et al. (1998) Genome Res. 8: 195-202.

Program: SPScan
Description: A weight matrix analysis program that scans protein sequences for the presence of secretory signal peptides.
Reference: Nielson, H. et al. (1997) Protein Engineering 10: 1-6; Claverie, J. M. and S. Audic (1997) CABIOS 12: 431-439.
Parameter Threshold: Score = 3.5 or greater

Program: TMAP
Description: A program that uses weight matrices to delineate transmembrane segments on protein sequences and determine orientation.
Reference: Persson, B. and P. Argos (1994) J. Mol. Biol. 237: 182-192; Persson, B. and P. Argos (1996) Protein Sci. 5: 363-371.

Program: TMHMMER
Description: A program that uses a hidden Markov model (HMM) to delineate transmembrane segments on protein sequences and determine orientation.
Reference: Sonnhammer, E. L. et al. (1998) Proc. Sixth Intl. Conf. on Intelligent Systems for Mol. Biol., Glasgow et al., eds., The Am. Assoc. for Artificial Intelligence Press, Menlo Park, CA, pp. 175-182.

Program: Motifs
Description: A program that searches amino acid sequences for patterns that matched those defined in Prosite.
Reference: Bairoch, A. et al. (1997) Nucleic Acids Res. 25: 217-221; Wisconsin Package Program Manual, version 9, page M51-59, Genetics Computer Group, Madison, WI.
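By way of illustration only, the parameter thresholds of Table 7 can be read as simple pass/fail cutoffs applied to each hit. The following minimal Python sketch encodes a few of them. The numeric values are taken from Table 7; the function names, the constant names, and the "est"/"full_length" labels are hypothetical and are not part of any of the listed programs.

```python
# Threshold values are copied from Table 7. All names below are hypothetical
# and chosen for illustration; they are not an interface of BLAST, FASTA,
# Phrap, or SPScan.
BLAST_MAX_PROBABILITY = {
    "est": 1.0e-8,           # ESTs: probability value = 1.0E-8 or less
    "full_length": 1.0e-10,  # Full length: probability value = 1.0E-10 or less
}
FASTA_ASSEMBLED_EST_MIN_IDENTITY = 95.0     # percent identity, or greater
FASTA_ASSEMBLED_EST_MIN_MATCH_LENGTH = 200  # bases, or greater
PHRAP_MIN_SCORE = 120
PHRAP_MIN_MATCH_LENGTH = 56
SPSCAN_MIN_SCORE = 3.5


def passes_blast(probability, sequence_class):
    """Return True if a BLAST probability value meets the Table 7 cutoff."""
    return probability <= BLAST_MAX_PROBABILITY[sequence_class]


def passes_fasta_assembled_est(identity_percent, match_length_bases):
    """Return True if a fasta hit on assembled ESTs meets both cutoffs."""
    return (identity_percent >= FASTA_ASSEMBLED_EST_MIN_IDENTITY
            and match_length_bases >= FASTA_ASSEMBLED_EST_MIN_MATCH_LENGTH)


def passes_phrap(score, match_length):
    """Return True if a Phrap alignment meets the score and length cutoffs."""
    return score >= PHRAP_MIN_SCORE and match_length >= PHRAP_MIN_MATCH_LENGTH


def passes_spscan(score):
    """Return True if an SPScan signal peptide score meets the cutoff."""
    return score >= SPSCAN_MIN_SCORE


if __name__ == "__main__":
    # The same probability value can pass as an EST hit yet fail the
    # stricter full-length cutoff.
    print(passes_blast(3.0e-9, "est"))
    print(passes_blast(3.0e-9, "full_length"))
```

Keeping each cutoff as a named constant makes it straightforward to compare the sketch against the corresponding Table 7 entry.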

[0449]
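The sequence listing that follows gives each polypeptide as three-letter amino acid codes interleaved with residue position numbers. As a reading aid only, the minimal Python sketch below converts such a run of residues into the one-letter notation; the function name and the error handling are hypothetical, and the sketch assumes that only the residue portion of an entry is supplied (not the header fields such as the length, "PRT", or the Incyte ID).

```python
# Standard three-letter to one-letter codes for the twenty common amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def residues_to_one_letter(residue_text):
    """Convert whitespace-separated three-letter residues to one-letter codes.

    Purely numeric tokens are treated as position markers and skipped.
    Any other unrecognized token raises ValueError (a hypothetical choice).
    """
    letters = []
    for token in residue_text.split():
        if token.isdigit():
            continue  # position number such as 5, 10, 15
        if token not in THREE_TO_ONE:
            raise ValueError("unrecognized residue token: " + token)
        letters.append(THREE_TO_ONE[token])
    return "".join(letters)


if __name__ == "__main__":
    # First fifteen residues of SEQ ID NO:1 as they appear in the listing.
    first_line = ("Met Ala Asp Pro Leu Arg Arg Thr Leu Ser Arg Leu Arg Gly Arg "
                  "1 5 10 15")
    print(residues_to_one_letter(first_line))
```

Raising an error on unknown tokens, rather than silently dropping them, is a design choice made here so that header fields accidentally passed in are noticed.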

1 46 1 1354 PRT Homo sapiens misc_feature Incyte ID No 7461789CD1 1 Met Ala Asp Pro Leu Arg Arg Thr Leu Ser Arg Leu Arg Gly Arg 1 5 10 15 Arg Gly Pro Arg Gly Thr Gly Gly Leu Gly Leu Arg Ala Ala Ala 20 25 30 Ala Ala Ala Val Ala Ala Ser Ser Ala Ala Ala Gly Asp Ala Trp 35 40 45 Gly Ala Ala Asp Thr Leu Pro Arg Glu His Ala Gly Gly His Gly 50 55 60 Arg Ser Leu Gln Gln Pro Ser Pro Ser Pro Glu Ala Trp Gly Pro 65 70 75 Gly Ala Arg Val Pro Gly Gly His Pro Glu Gln Leu Gly Ala Leu 80 85 90 Gly Pro Arg Pro Arg Gly Gly Gln Glu Ala Ala Pro Gln Ser His 95 100 105 Gly Leu Ala His Ala Pro Pro His Ser Pro Glu Gly Ser Glu Gly 110 115 120 Ser Gly Glu Glu Glu Glu Asp Asp Glu Asp Glu Asp Asp Tyr Asp 125 130 135 Ala Asp Tyr Tyr Glu Asn Leu Pro Gly Gly Ser Gln Ser Ala Pro 140 145 150 Glu Pro Glu Gly Ala Glu Ala Glu Arg Arg Pro Pro Pro Pro Pro 155 160 165 Ala Ala Gly Ser Ser Leu Gly Ala Glu Gly Gly Arg Leu Glu Thr 170 175 180 Gly Arg Leu Arg Thr Gln Leu Arg Glu Ala Tyr Tyr Leu Leu Ile 185 190 195 Gln Ala Met His Asp Leu Pro Pro Asp Ser Gly Ala Arg Arg Gly 200 205 210 Gly Arg Gly Leu Ala Asp His Ser Phe Pro Ala Gly Ala Arg Ala 215 220 225 Pro Gly Gln Pro Pro Ser Arg Gly Ala Ala Tyr Arg Arg Ala Cys 230 235 240 Pro Arg Asp Gly Glu Arg Gly Gly Gly Gly Arg Pro Arg Gln Gln 245 250 255 Val Ser Pro Pro Arg Ser Pro Gln Arg Glu Pro Arg Gly Gly Gln 260 265 270 Leu Arg Thr Pro Arg Met Arg Pro Ser Cys Ser Arg Ser Leu Glu 275 280 285 Ser Leu Arg Val Gly Ala Lys Pro Pro Pro Phe Gln Arg Trp Pro 290 295 300 Ser Asp Ser Trp Ile Arg Leu Gln Gly Pro Arg Leu Leu Leu Gly 305 310 315 Lys Pro Phe Arg Asp Pro Ala Gly Ser Ser Val Ile Arg Ser Gly 320 325 330 Lys Gly Asp Arg Pro Glu Gly Pro Ser Phe Leu Arg Pro Pro Ala 335 340 345 Val Thr Val Lys Lys Leu Gln Lys Trp Met Tyr Lys Gly Arg Leu 350 355 360 Leu Ser Leu Gly Met Lys Gly Arg Ala Arg Gly Thr Ala Pro Lys 365 370 375 Val Thr Gly Thr Gln Ala Ala Ser Pro Asn Val Gly Ala Leu Lys 380 385 390 Val Arg Glu Asn Arg Val Leu Ser Val Pro Pro Asp Gln Arg Ile 395 400 405 Thr 
Leu Thr Asp Leu Phe Glu Asn Ala Tyr Gly Ser Ser Met Lys 410 415 420 Gly Arg Glu Leu Glu Glu Leu Lys Asp Asn Ile Glu Phe Arg Gly 425 430 435 His Lys Pro Leu Asn Ser Ile Thr Val Ser Lys Lys Arg Asn Trp 440 445 450 Leu Tyr Gln Ser Thr Leu Arg Pro Leu Asn Leu Glu Glu Glu Asn 455 460 465 Lys Lys Cys Gln Asp Arg Ser His Leu Ser Ile Ser Pro Val Ser 470 475 480 Leu Pro Lys His Gln Leu Ser Gln Ser Phe Leu Lys Ser Ser Lys 485 490 495 Glu Tyr Cys Thr Tyr Val Val Cys Asn Ala Thr Asn Ser Ser Leu 500 505 510 Ser Lys Asn Cys Ala Leu Asp Phe Asn Glu Glu Asn Asp Ala Asp 515 520 525 Asp Glu Gly Glu Ile Trp Tyr Asn Pro Ile Pro Glu Asp Asp Asp 530 535 540 Leu Gly Ile Ser Ser Ala Leu Ser Phe Gly Glu Ala Asp Ser Ala 545 550 555 Val Leu Lys Leu Pro Ala Val Asn Leu Ser Met Leu Ser Gly Ser 560 565 570 Asp Leu Met Lys Ala Glu Arg His Thr Glu Asp Ser Leu Cys Ser 575 580 585 Ser Glu His Ala Gly Asp Ile Gln Thr Thr Arg Ser Asn Gly Met 590 595 600 Asn Pro Ile His Pro Ala His Ser Thr Glu Phe Val Gln Gln Tyr 605 610 615 Lys Gln Lys Leu Gly His Lys Thr Gln Glu Gly Ile Met Val Glu 620 625 630 Asp Ser Pro Met Leu Lys Ser Pro Phe Ala Gly Ser Gly Ile Leu 635 640 645 Ala Ala Thr Asn Ser Thr Glu Leu Gly Ile Met Glu Pro Ser Ser 650 655 660 Pro Asn Pro Ser Pro Val Lys Lys Gly Ser Ser Ile Asn Trp Ser 665 670 675 Leu Pro Asp Lys Ile Lys Ser Pro Arg Thr Val Arg Lys Leu Ser 680 685 690 Met Lys Met Lys Lys Leu Pro Glu Phe Ser Arg Lys Leu Ser Val 695 700 705 Lys Gly Thr Leu Asn Tyr Ile Asn Ser Pro Asp Asn Thr Pro Ser 710 715 720 Leu Ser Lys Tyr Asn Cys Arg Glu Ile His His Thr Asp Ile Leu 725 730 735 Pro Ser Gly Asn Thr Thr Thr Ala Ala Lys Arg Asn Val Ile Ser 740 745 750 Arg Tyr His Leu Asp Thr Ser Val Ser Ser Gln Gln Ser Tyr Gln 755 760 765 Lys Lys Asn Ser Met Ser Ser Lys Tyr Ser Cys Lys Gly Gly Tyr 770 775 780 Leu Ser Asp Gly Asp Ser Pro Glu Leu Thr Thr Lys Ala Ser Lys 785 790 795 His Gly Ser Glu Asn Lys Phe Gly Lys Gly Lys Glu Ile Ile Ser 800 805 810 Asn Ser Cys Ser Lys Asn Glu Ile Asp Ile Asp Ala Phe Arg His 
815 820 825 Tyr Ser Phe Ser Asp Gln Pro Lys Cys Ser Gln Tyr Ile Ser Gly 830 835 840 Leu Met Ser Val His Phe Tyr Gly Ala Glu Asp Leu Lys Pro Pro 845 850 855 Arg Ile Asp Ser Lys Asp Val Phe Cys Ala Ile Gln Val Asp Ser 860 865 870 Val Asn Lys Ala Arg Thr Ala Leu Leu Thr Cys Arg Thr Thr Phe 875 880 885 Leu Asp Met Asp His Thr Phe Asn Ile Glu Ile Glu Asn Ala Gln 890 895 900 His Leu Lys Leu Val Val Phe Ser Trp Glu Pro Thr Pro Arg Lys 905 910 915 Asn Arg Val Cys Cys His Gly Thr Val Val Leu Pro Thr Leu Phe 920 925 930 Arg Val Thr Lys Thr His Gln Leu Ala Val Lys Leu Glu Pro Arg 935 940 945 Gly Leu Ile Tyr Val Lys Val Thr Leu Met Glu Gln Trp Glu Asn 950 955 960 Ser Leu His Gly Leu Asp Ile Asn Gln Glu Pro Ile Ile Phe Gly 965 970 975 Val Asp Ile Gln Lys Val Val Glu Lys Glu Asn Ile Gly Leu Met 980 985 990 Val Pro Leu Leu Ile Gln Lys Cys Ile Met Glu Ile Glu Lys Arg 995 1000 1005 Gly Cys Gln Val Val Gly Leu Tyr Arg Leu Cys Gly Ser Ala Ala 1010 1015 1020 Val Lys Lys Glu Leu Arg Glu Ala Phe Glu Arg Asp Ser Lys Ala 1025 1030 1035 Val Gly Leu Cys Glu Asn Gln Tyr Pro Asp Ile Asn Val Ile Thr 1040 1045 1050 Gly Val Leu Lys Asp Tyr Leu Arg Glu Leu Pro Ser Pro Leu Ile 1055 1060 1065 Thr Lys Gln Leu Tyr Glu Ala Val Leu Asp Ala Met Ala Lys Ser 1070 1075 1080 Pro Leu Lys Met Ser Ser Asn Gly Cys Glu Asn Asp Pro Gly Asp 1085 1090 1095 Ser Lys Tyr Thr Val Asp Leu Leu Asp Cys Leu Pro Glu Ile Glu 1100 1105 1110 Lys Ala Thr Leu Lys Met Leu Leu Asp His Leu Lys Leu Val Ala 1115 1120 1125 Ser Tyr His Glu Val Asn Lys Met Thr Cys Gln Asn Leu Ala Val 1130 1135 1140 Cys Phe Gly Pro Val Leu Leu Ser Gln Arg Gln Glu Pro Ser Thr 1145 1150 1155 His Asn Asn Arg Val Phe Thr Asp Ser Glu Glu Leu Ala Ser Ala 1160 1165 1170 Leu Asp Phe Lys Lys His Ile Glu Val Leu His Tyr Leu Leu Gln 1175 1180 1185 Leu Trp Pro Val Gln Arg Leu Thr Val Lys Lys Ser Thr Asp Asn 1190 1195 1200 Leu Phe Pro Glu Gln Lys Ser Ser Leu Asn Tyr Leu Arg Gln Lys 1205 1210 1215 Lys Glu Arg Pro His Met Leu Asn Leu Ser Gly Thr Asp Ser Ser 1220 1225 
1230 Gly Val Leu Arg Pro Arg Gln Asn Arg Leu Asp Ser Pro Leu Ser 1235 1240 1245 Asn Arg Tyr Ala Gly Asp Trp Ser Ser Cys Gly Glu Asn Tyr Phe 1250 1255 1260 Leu Asn Thr Lys Glu Asn Leu Asn Asp Val Asp Tyr Asp Asp Val 1265 1270 1275 Pro Ser Glu Asp Arg Lys Ile Gly Glu Asn Tyr Ser Lys Met Asp 1280 1285 1290 Gly Pro Glu Val Met Ile Glu Gln Pro Ile Pro Met Ser Lys Glu 1295 1300 1305 Cys Thr Phe Gln Thr Tyr Leu Thr Met Gln Thr Ile Glu Ser Thr 1310 1315 1320 Val Asp Arg Lys Asn Asn Leu Lys Asp Leu Gln Glu Ser Ile Asp 1325 1330 1335 Thr Leu Ile Gly Asn Leu Glu Arg Glu Leu Asn Lys Asn Lys Leu 1340 1345 1350 Asn Met Ser Phe 2 200 PRT Homo sapiens misc_feature Incyte ID No 1210450CD1 2 Met Glu Lys Pro Tyr Asn Lys Asn Glu Gly Asn Leu Glu Asn Glu 1 5 10 15 Gly Lys Pro Glu Asp Glu Val Glu Pro Asp Asp Glu Gly Lys Ser 20 25 30 Asp Glu Glu Glu Lys Pro Asp Val Glu Gly Lys Thr Glu Cys Glu 35 40 45 Gly Lys Arg Glu Asp Glu Gly Glu Pro Gly Asp Glu Gly Gln Leu 50 55 60 Glu Asp Glu Gly Ser Gln Glu Lys Gln Gly Arg Ser Glu Gly Glu 65 70 75 Gly Lys Pro Gln Gly Glu Gly Lys Pro Ala Ser Gln Ala Lys Pro 80 85 90 Glu Ser Gln Pro Arg Ala Ala Glu Lys Arg Pro Ala Glu Asp Tyr 95 100 105 Val Pro Arg Lys Ala Lys Arg Lys Thr Asp Arg Gly Thr Asp Asp 110 115 120 Ser Pro Lys Asp Ser Gln Glu Asp Leu Gln Glu Arg His Leu Ser 125 130 135 Ser Glu Glu Met Met Arg Glu Cys Gly Asp Val Ser Arg Ala Gln 140 145 150 Glu Glu Leu Arg Lys Lys Gln Lys Met Gly Gly Phe His Trp Met 155 160 165 Gln Arg Asp Val Gln Asp Pro Phe Ala Pro Arg Gly Gln Arg Gly 170 175 180 Val Arg Gly Val Arg Gly Gly Gly Arg Gly Gln Lys Asp Leu Glu 185 190 195 Asp Val Pro Tyr Val 200 3 403 PRT Homo sapiens misc_feature Incyte ID No 427539CD1 3 Met Glu Glu Glu Arg Gly Ser Ala Leu Ala Ala Glu Ser Ala Leu 1 5 10 15 Glu Lys Asn Val Ala Glu Leu Thr Val Met Asp Val Tyr Asp Ile 20 25 30 Ala Ser Leu Val Gly His Glu Phe Glu Arg Val Ile Asp Gln His 35 40 45 Gly Cys Glu Ala Ile Ala Arg Leu Met Pro Lys Val Val Arg Val 50 55 60 Leu Glu Ile Leu Glu Val Leu Val Ser 
Arg His His Val Ala Pro 65 70 75 Glu Leu Asp Glu Leu Arg Leu Glu Leu Asp Arg Leu Arg Leu Glu 80 85 90 Arg Met Asp Arg Ile Glu Lys Glu Arg Lys His Gln Lys Glu Leu 95 100 105 Glu Leu Val Glu Asp Val Trp Arg Gly Glu Ala Gln Asp Leu Leu 110 115 120 Ser Gln Ile Ala Gln Leu Gln Glu Glu Asn Lys Gln Leu Met Thr 125 130 135 Asn Leu Ser His Lys Asp Val Asn Phe Ser Glu Glu Glu Phe Gln 140 145 150 Lys His Glu Gly Met Ser Glu Arg Glu Arg Gln Val Met Lys Lys 155 160 165 Leu Lys Glu Val Val Asp Lys Gln Arg Asp Glu Ile Arg Ala Lys 170 175 180 Asp Arg Glu Leu Gly Leu Lys Asn Glu Asp Val Glu Ala Leu Gln 185 190 195 Gln Gln Gln Thr Arg Leu Met Lys Ile Asn His Asp Leu Arg His 200 205 210 Arg Val Thr Val Val Glu Ala Gln Gly Lys Ala Leu Ile Glu Gln 215 220 225 Lys Val Glu Leu Glu Ala Asp Leu Gln Thr Lys Glu Gln Glu Met 230 235 240 Gly Ser Leu Arg Ala Glu Leu Gly Lys Leu Arg Glu Arg Leu Gln 245 250 255 Gly Glu His Ser Gln Asn Gly Glu Glu Glu Pro Glu Thr Glu Pro 260 265 270 Val Gly Glu Glu Ser Ile Ser Asp Ala Glu Lys Val Ala Met Asp 275 280 285 Leu Lys Asp Pro Asn Arg Pro Arg Phe Thr Leu Gln Glu Leu Arg 290 295 300 Asp Val Leu His Glu Arg Asn Glu Leu Lys Ser Lys Val Phe Leu 305 310 315 Leu Gln Glu Glu Leu Ala Tyr Tyr Lys Ser Glu Glu Met Glu Glu 320 325 330 Glu Asn Arg Ile Pro Gln Pro Pro Pro Ile Ala His Pro Arg Thr 335 340 345 Ser Pro Gln Pro Glu Ser Gly Ile Lys Arg Leu Phe Ser Phe Phe 350 355 360 Ser Arg Asp Lys Lys Arg Leu Ala Asn Thr Gln Arg Asn Val His 365 370 375 Ile Gln Glu Ser Phe Gly Gln Trp Ala Asn Thr His Arg Asp Asp 380 385 390 Gly Tyr Thr Glu Gln Gly Gln Glu Ala Leu Gln His Leu 395 400 4 1255 PRT Homo sapiens misc_feature Incyte ID No 1545043CD1 4 Met Leu Asp Pro Ser Ser Ser Glu Glu Glu Ser Asp Glu Gly Leu 1 5 10 15 Glu Glu Glu Ser Arg Asp Val Leu Val Ala Ala Gly Ser Ser Gln 20 25 30 Arg Ala Pro Pro Ala Pro Thr Arg Glu Gly Arg Arg Asp Ala Pro 35 40 45 Gly Arg Ala Gly Gly Gly Gly Ala Ala Arg Ser Val Ser Pro Ser 50 55 60 Pro Ser Val Leu Ser Glu Gly Arg Asp Glu Pro Gln Arg Gln 
Leu 65 70 75 Asp Asp Glu Gln Glu Arg Arg Ile Arg Leu Gln Leu Tyr Val Phe 80 85 90 Val Val Arg Cys Ile Ala Tyr Pro Phe Asn Ala Lys Gln Pro Thr 95 100 105 Asp Met Ala Arg Arg Gln Gln Lys Leu Asn Lys Gln Gln Leu Gln 110 115 120 Leu Leu Lys Glu Arg Phe Gln Ala Phe Leu Asn Gly Glu Thr Gln 125 130 135 Ile Val Ala Asp Glu Ala Phe Cys Asn Ala Val Arg Ser Tyr Tyr 140 145 150 Glu Val Phe Leu Lys Ser Asp Arg Val Ala Arg Met Val Gln Ser 155 160 165 Gly Gly Cys Ser Ala Asn Asp Phe Arg Glu Val Phe Lys Lys Asn 170 175 180 Ile Glu Lys Arg Val Arg Ser Leu Pro Glu Ile Asp Gly Leu Ser 185 190 195 Lys Glu Thr Val Leu Ser Ser Trp Ile Ala Lys Tyr Asp Ala Ile 200 205 210 Tyr Arg Gly Glu Glu Asp Leu Cys Lys Gln Pro Asn Arg Met Ala 215 220 225 Leu Ser Ala Val Ser Glu Leu Ile Leu Ser Lys Glu Gln Leu Tyr 230 235 240 Glu Met Phe Gln Gln Ile Leu Gly Ile Lys Lys Leu Glu His Gln 245 250 255 Leu Leu Tyr Asn Ala Cys Gln Leu Asp Asn Ala Asp Glu Gln Ala 260 265 270 Ala Gln Ile Arg Arg Glu Leu Asp Gly Arg Leu Gln Leu Ala Asp 275 280 285 Lys Met Ala Lys Glu Arg Lys Phe Pro Lys Phe Ile Ala Lys Asp 290 295 300 Met Glu Asn Met Tyr Ile Glu Glu Leu Arg Ser Ser Val Asn Leu 305 310 315 Leu Met Ala Asn Leu Glu Ser Leu Pro Val Ser Lys Gly Gly Pro 320 325 330 Glu Phe Lys Leu Gln Lys Leu Lys Arg Ser Gln Asn Ser Ala Phe 335 340 345 Leu Asp Ile Gly Asp Glu Asn Glu Ile Gln Leu Ser Lys Ser Asp 350 355 360 Val Val Leu Ser Phe Thr Leu Glu Ile Val Ile Met Glu Val Gln 365 370 375 Gly Leu Lys Ser Val Ala Pro Asn Arg Ile Val Tyr Cys Thr Met 380 385 390 Glu Val Glu Gly Glu Lys Leu Gln Thr Asp Gln Ala Glu Ala Ser 395 400 405 Arg Pro Gln Trp Gly Thr Gln Gly Asp Phe Thr Thr Thr His Pro 410 415 420 Arg Pro Val Val Lys Val Lys Leu Phe Thr Glu Ser Thr Gly Val 425 430 435 Leu Ala Leu Glu Asp Lys Glu Leu Gly Arg Val Ile Leu Tyr Pro 440 445 450 Thr Ser Asn Ser Ser Lys Ser Ala Glu Leu His Arg Met Val Val 455 460 465 Pro Lys Asn Ser Gln Asp Ser Asp Leu Lys Ile Lys Leu Ala Val 470 475 480 Arg Met Asp Lys Pro Ala His Met Lys His Ser 
Gly Tyr Leu Tyr 485 490 495 Ala Leu Gly Gln Lys Val Trp Lys Arg Trp Lys Lys Arg Tyr Phe 500 505 510 Val Leu Val Gln Val Ser Gln Tyr Thr Phe Ala Met Cys Ser Tyr 515 520 525 Arg Glu Lys Lys Ser Glu Pro Gln Glu Leu Met Gln Leu Glu Gly 530 535 540 Tyr Thr Val Asp Tyr Thr Asp Pro His Pro Gly Leu Gln Gly Gly 545 550 555 Cys Met Phe Phe Asn Ala Val Lys Glu Gly Asp Thr Val Ile Phe 560 565 570 Ala Ser Asp Asp Glu Gln Asp Arg Ile Leu Trp Val Gln Ala Met 575 580 585 Tyr Arg Ala Thr Gly Gln Ser Tyr Lys Pro Val Pro Ala Ile Gln 590 595 600 Thr Gln Lys Leu Asn Pro Lys Gly Gly Thr Leu His Ala Asp Ala 605 610 615 Gln Leu Tyr Ala Asp Arg Phe Gln Lys His Gly Met Asp Glu Phe 620 625 630 Ile Ser Ala Asn Pro Cys Lys Leu Asp His Ala Phe Leu Phe Arg 635 640 645 Ile Leu Gln Arg Gln Thr Leu Asp His Arg Leu Asn Asp Ser Tyr 650 655 660 Ser Cys Leu Gly Trp Phe Ser Pro Gly Gln Val Phe Val Leu Asp 665 670 675 Glu Tyr Cys Ala Arg Tyr Gly Val Arg Gly Cys His Arg His Leu 680 685 690 Cys Tyr Leu Ala Glu Leu Met Glu His Ser Glu Asn Gly Ala Val 695 700 705 Ile Asp Pro Thr Leu Leu His Tyr Ser Phe Ala Phe Cys Ala Ser 710 715 720 His Val His Gly Asn Arg Pro Asp Gly Ile Gly Thr Val Ser Val 725 730 735 Glu Glu Lys Glu Arg Phe Glu Glu Ile Lys Glu Arg Leu Ser Ser 740 745 750 Leu Leu Glu Asn Gln Ile Ser His Phe Arg Tyr Cys Phe Pro Phe 755 760 765 Gly Arg Pro Glu Gly Ala Leu Lys Ala Thr Leu Ser Leu Leu Glu 770 775 780 Arg Val Leu Met Lys Asp Ile Ala Thr Pro Ile Pro Ala Glu Glu 785 790 795 Val Lys Lys Val Val Arg Lys Cys Leu Glu Lys Ala Ala Leu Ile 800 805 810 Asn Tyr Thr Arg Leu Thr Glu Tyr Ala Lys Ile Glu Glu Thr Met 815 820 825 Asn Gln Ala Ser Pro Ala Arg Lys Leu Glu Glu Ile Leu His Leu 830 835 840 Ala Glu Leu Cys Ile Glu Val Leu Gln Gln Asn Glu Glu His His 845 850 855 Ala Glu Ala Phe Ala Trp Trp Pro Asp Leu Leu Ala Glu His Ala 860 865 870 Glu Lys Phe Trp Ala Leu Phe Thr Val Asp Met Asp Thr Ala Leu 875 880 885 Glu Ala Gln Pro Gln Asp Ser Trp Asp Ser Phe Pro Leu Phe Gln 890 895 900 Leu Leu Asn Asn Phe Leu Arg 
Asn Asp Thr Leu Leu Cys Asn Gly 905 910 915 Lys Phe His Lys His Leu Gln Glu Ile Phe Val Pro Leu Val Val 920 925 930 Arg Tyr Val Asp Leu Met Glu Ser Ser Ile Ala Gln Ser Ile His 935 940 945 Arg Gly Phe Glu Gln Glu Thr Trp Gln Pro Val Asn Asn Gly Ser 950 955 960 Ala Thr Ser Glu Asp Leu Phe Trp Lys Leu Asp Ala Leu Gln Met 965 970 975 Phe Val Phe Asp Leu His Trp Pro Glu Gln Glu Phe Ala His His 980 985 990 Leu Glu Gln Arg Leu Lys Leu Met Ala Ser Asp Met Leu Glu Ala 995 1000 1005 Cys Val Lys Arg Thr Arg Thr Ala Phe Glu Leu Lys Leu Gln Lys 1010 1015 1020 Ala Ser Lys Thr Thr Asp Leu Arg Ile Pro Ala Ser Val Cys Thr 1025 1030 1035 Met Phe Asn Val Leu Val Asp Ala Lys Lys Gln Ser Thr Lys Leu 1040 1045 1050 Cys Ala Leu Asp Gly Gly Gln Glu Phe Gly Ser Gln Trp Gln Gln 1055 1060 1065 Tyr His Ser Lys Ile Asp Asp Leu Ile Asp Asn Ser Val Lys Glu 1070 1075 1080 Ile Ile Ser Leu Leu Val Ser Lys Phe Val Ser Val Leu Glu Gly 1085 1090 1095 Val Leu Ser Lys Leu Ser Arg Tyr Asp Glu Gly Thr Phe Phe Ser 1100 1105 1110 Ser Ile Leu Ser Phe Thr Val Lys Ala Ala Ala Lys Tyr Val Asp 1115 1120 1125 Val Pro Lys Pro Gly Met Asp Leu Ala Asp Thr Tyr Ile Met Phe 1130 1135 1140 Val Arg Gln Asn Gln Asp Ile Leu Arg Glu Lys Val Asn Glu Glu 1145 1150 1155 Met Tyr Ile Glu Lys Leu Phe Asp Gln Trp Tyr Ser Ser Ser Met 1160 1165 1170 Lys Val Ile Cys Val Trp Leu Thr Asp Arg Leu Asp Leu Gln Leu 1175 1180 1185 His Ile Tyr Gln Leu Lys Thr Leu Ile Lys Ile Val Lys Lys Thr 1190 1195 1200 Tyr Arg Asp Phe Arg Leu Gln Gly Val Leu Glu Gly Thr Leu Asn 1205 1210 1215 Ser Lys Thr Tyr Asp Thr Val His Arg Arg Leu Thr Val Glu Glu 1220 1225 1230 Ala Thr Ala Ser Val Ser Glu Gly Gly Gly Leu Gln Gly Ile Thr 1235 1240 1245 Met Lys Asp Ser Asp Glu Glu Glu Glu Gly 1250 1255 5 346 PRT Homo sapiens misc_feature Incyte ID No 7488231CD1 5 Met Ala Arg Gly Gly Arg Gly Arg Arg Leu Gly Leu Ala Leu Gly 1 5 10 15 Leu Leu Leu Ala Leu Val Leu Ala Pro Arg Val Leu Arg Ala Lys 20 25 30 Pro Thr Val Arg Lys Glu Arg Val Val Arg Pro Asp Ser Glu Leu 35 40 45 
Gly Glu Arg Pro Pro Glu Asp Asn Gln Ser Phe Gln Tyr Asp His 50 55 60 Glu Ala Phe Leu Gly Lys Glu Asp Ser Lys Thr Phe Asp Gln Leu 65 70 75 Thr Pro Asp Glu Ser Lys Glu Asp Ser Lys Thr Phe Asp Gln Leu 80 85 90 Thr Pro Asp Glu Ser Lys Glu Arg Leu Gly Lys Ile Val Asp Arg 95 100 105 Ile Asp Asn Asp Gly Asp Gly Phe Val Thr Thr Glu Glu Leu Lys 110 115 120 Thr Trp Ile Lys Arg Val Gln Lys Arg Tyr Ile Phe Asp Asn Val 125 130 135 Ala Lys Val Trp Lys Ala Tyr Asp Arg Asp Lys Asp Asp Lys Ile 140 145 150 Ser Trp Glu Glu Tyr Lys Gln Ala Thr Tyr Gly Tyr Tyr Leu Gly 155 160 165 Asn Pro Ala Glu Phe His Asp Ser Ser Asp His His Thr Phe Lys 170 175 180 Lys Met Leu Pro Arg Asp Glu Arg Arg Phe Lys Ala Ala Asp Leu 185 190 195 Asn Gly Asp Leu Thr Ala Thr Arg Glu Glu Phe Thr Ala Phe Leu 200 205 210 His Pro Glu Glu Phe Glu His Met Lys Glu Ile Val Val Leu Glu 215 220 225 Thr Leu Glu Asp Ile Asp Lys Asn Gly Asp Gly Phe Val Asp Gln 230 235 240 Asp Glu Tyr Ile Ala Asp Met Phe Ser His Glu Glu Asn Gly Pro 245 250 255 Glu Pro Asp Trp Val Leu Ser Glu Arg Glu Gln Phe Asn Glu Phe 260 265 270 Arg Asp Leu Asn Lys Asp Gly Lys Leu Asp Lys Asp Glu Ile Arg 275 280 285 His Trp Ile Leu Pro Gln Asp Tyr Asp His Ala Gln Ala Glu Ala 290 295 300 Arg His Leu Val Tyr Glu Ser Asp Lys Asn Lys Asp Glu Lys Leu 305 310 315 Thr Lys Glu Glu Ile Leu Glu Asn Trp Asn Met Phe Val Gly Ser 320 325 330 Gln Ala Thr Asn Tyr Gly Glu Asp Leu Thr Lys Asn His Asp Glu 335 340 345 Leu 6 734 PRT Homo sapiens misc_feature Incyte ID No 1910008CD1 6 Met Glu Val Lys Arg Leu Lys Val Thr Glu Leu Arg Ser Glu Leu 1 5 10 15 Gln Arg Arg Gly Leu Asp Ser Arg Gly Leu Lys Val Asp Leu Ala 20 25 30 Gln Arg Leu Gln Glu Ala Leu Asp Ala Glu Met Leu Glu Asp Glu 35 40 45 Ala Gly Gly Gly Gly Ala Gly Pro Gly Gly Ala Cys Lys Ala Glu 50 55 60 Pro Arg Pro Val Ala Ala Ser Gly Gly Gly Pro Gly Gly Asp Glu 65 70 75 Glu Glu Asp Glu Glu Glu Glu Glu Glu Asp Glu Glu Ala Leu Leu 80 85 90 Glu Asp Glu Asp Glu Glu Pro Pro Pro Ala Gln Ala Leu Gly Gln 95 100 105 Ala Ala Gln Pro 
Pro Pro Glu Pro Pro Glu Ala Ala Ala Met Glu 110 115 120 Ala Ala Ala Glu Pro Asp Ala Ser Glu Lys Pro Ala Glu Ala Thr 125 130 135 Ala Gly Ser Gly Gly Val Asn Gly Gly Glu Glu Gln Gly Leu Gly 140 145 150 Lys Arg Glu Glu Asp Glu Pro Glu Glu Arg Ser Gly Asp Glu Thr 155 160 165 Pro Gly Ser Glu Val Pro Gly Asp Lys Ala Ala Glu Glu Gln Gly 170 175 180 Asp Asp Gln Asp Ser Glu Lys Ser Lys Pro Ala Gly Ser Asp Gly 185 190 195 Glu Arg Arg Gly Val Lys Arg Gln Arg Asp Glu Lys Asp Glu His 200 205 210 Gly Arg Ala Tyr Tyr Glu Phe Arg Glu Glu Ala Tyr His Ser Arg 215 220 225 Ser Lys Ser Pro Leu Pro Pro Glu Glu Glu Ala Lys Asp Glu Glu 230 235 240 Glu Asp Gln Thr Leu Val Asn Leu Asp Thr Tyr Thr Ser Asp Leu 245 250 255 His Phe Gln Val Ser Lys Asp Arg Tyr Gly Gly Gln Pro Leu Phe 260 265 270 Ser Glu Lys Phe Pro Thr Leu Trp Ser Gly Ala Arg Ser Thr Tyr 275 280 285 Gly Val Thr Lys Gly Lys Val Cys Phe Glu Ala Lys Val Thr Gln 290 295 300 Asn Leu Pro Met Lys Glu Gly Cys Thr Glu Val Ser Leu Leu Arg 305 310 315 Val Gly Trp Ser Val Asp Phe Ser Arg Pro Gln Leu Gly Glu Asp 320 325 330 Glu Phe Ser Tyr Gly Phe Asp Gly Arg Gly Leu Lys Ala Glu Asn 335 340 345 Gly Gln Phe Glu Glu Phe Gly Gln Thr Phe Gly Glu Asn Asp Val 350 355 360 Ile Gly Cys Phe Ala Asn Phe Glu Thr Glu Glu Val Glu Leu Ser 365 370 375 Phe Ser Lys Asn Gly Glu Asp Leu Gly Val Ala Phe Trp Ile Ser 380 385 390 Lys Asp Ser Leu Ala Asp Arg Ala Leu Leu Pro His Val Leu Cys 395 400 405 Lys Asn Cys Val Val Glu Leu Asn Phe Gly Gln Lys Glu Glu Pro 410 415 420 Phe Phe Pro Pro Pro Glu Glu Phe Val Phe Ile His Ala Val Pro 425 430 435 Val Glu Glu Arg Val Arg Thr Ala Val Pro Pro Lys Thr Ile Glu 440 445 450 Glu Cys Glu Val Ile Leu Met Val Gly Leu Pro Gly Ser Gly Lys 455 460 465 Thr Gln Trp Ala Leu Lys Tyr Ala Lys Glu Asn Pro Glu Lys Arg 470 475 480 Tyr Asn Val Leu Gly Ala Glu Thr Val Leu Asn Gln Met Arg Met 485 490 495 Lys Gly Leu Glu Glu Pro Glu Met Asp Pro Lys Ser Arg Asp Leu 500 505 510 Leu Val Gln Gln Ala Ser Gln Cys Leu Ser Lys Leu Val Gln Ile 515 520 525 
Ala Ser Arg Thr Lys Arg Asn Phe Ile Leu Asp Gln Cys Asn Val 530 535 540 Tyr Asn Ser Gly Gln Arg Arg Lys Leu Leu Leu Phe Lys Thr Phe 545 550 555 Ser Arg Lys Val Val Val Val Val Pro Asn Glu Glu Asp Trp Lys 560 565 570 Lys Arg Leu Glu Leu Arg Lys Glu Val Glu Gly Asp Asp Val Pro 575 580 585 Glu Ser Ile Met Leu Glu Met Lys Ala Asn Phe Ser Leu Pro Glu 590 595 600 Lys Cys Asp Tyr Met Asp Glu Val Thr Tyr Gly Glu Leu Glu Lys 605 610 615 Glu Glu Ala Gln Pro Ile Val Thr Lys Tyr Lys Glu Glu Ala Arg 620 625 630 Lys Leu Leu Pro Pro Ser Glu Lys Arg Thr Asn Arg Arg Asn Asn 635 640 645 Arg Asn Lys Arg Asn Arg Gln Asn Arg Ser Arg Gly Gln Gly Tyr 650 655 660 Val Gly Gly Gln Arg Arg Gly Tyr Asp Asn Arg Ala Tyr Gly Gln 665 670 675 Gln Tyr Trp Gly Gln Pro Gly Asn Arg Gly Gly Tyr Arg Asn Phe 680 685 690 Tyr Asp Arg Tyr Arg Gly Asp Tyr Asp Arg Phe Tyr Gly Arg Asp 695 700 705 Tyr Glu Tyr Asn Lys Gln Gln Asn Cys Thr Phe Phe Leu Lys Phe 710 715 720 Val Glu Lys Asn Ile Val Leu Phe Tyr Lys Thr Phe Gln Thr 725 730 7 636 PRT Homo sapiens misc_feature Incyte ID No 5151459CD1 7 Met Asp Asn Lys Ile Ser Pro Glu Ala Gln Val Ala Glu Leu Glu 1 5 10 15 Leu Asp Ala Val Ile Gly Phe Asn Gly His Val Pro Thr Gly Leu 20 25 30 Lys Cys His Pro Asp Gln Glu His Met Ile Tyr Pro Leu Gly Cys 35 40 45 Thr Val Leu Ile Gln Ala Ile Asn Thr Lys Glu Gln Asn Phe Leu 50 55 60 Gln Gly His Gly Asn Asn Val Ser Cys Leu Ala Ile Ser Arg Ser 65 70 75 Gly Glu Tyr Ile Ala Ser Gly Gln Val Thr Phe Met Gly Phe Lys 80 85 90 Ala Asp Ile Ile Leu Trp Asp Tyr Lys Asn Arg Glu Leu Leu Ala 95 100 105 Arg Leu Ser Leu His Lys Gly Lys Ile Glu Ala Leu Ala Phe Ser 110 115 120 Pro Asn Asp Leu Tyr Leu Val Ser Leu Gly Gly Pro Asp Asp Gly 125 130 135 Ser Val Val Val Trp Ser Ile Ala Lys Arg Asp Ala Ile Cys Gly 140 145 150 Ser Pro Ala Ala Gly Leu Asn Val Gly Asn Ala Thr Asn Val Ile 155 160 165 Phe Ser Arg Cys Arg Asp Glu Met Phe Met Thr Ala Gly Asn Gly 170 175 180 Thr Ile Arg Val Trp Glu Leu Asp Leu Pro Asn Arg Lys Ile Trp 185 190 195 Pro Thr Glu Cys Gln 
Thr Gly Gln Leu Lys Arg Ile Val Met Ser 200 205 210 Ile Gly Val Asp Asp Asp Asp Ser Phe Phe Tyr Leu Gly Thr Thr 215 220 225 Thr Gly Asp Ile Leu Lys Met Asn Pro Arg Thr Lys Leu Leu Thr 230 235 240 Asp Val Gly Pro Ala Lys Asp Lys Phe Ser Leu Gly Val Ser Ala 245 250 255 Ile Arg Cys Leu Lys Met Gly Gly Leu Leu Val Gly Ser Gly Ala 260 265 270 Gly Leu Leu Val Phe Cys Lys Ser Pro Gly Tyr Lys Pro Ile Lys 275 280 285 Lys Ile Gln Leu Gln Gly Gly Ile Thr Ser Ile Thr Leu Arg Gly 290 295 300 Glu Gly His Gln Phe Leu Val Gly Thr Glu Glu Ser His Ile Tyr 305 310 315 Arg Val Ser Phe Thr Asp Phe Lys Glu Thr Leu Ile Ala Thr Cys 320 325 330 His Phe Asp Ala Val Lys Asp Ile Val Phe Pro Leu Phe Lys Arg 335 340 345 Phe Ser Cys Leu Ser Leu Leu Ser Ser Trp Asp Tyr Ser Gly Thr 350 355 360 Ala Glu Leu Phe Ala Thr Cys Ala Lys Lys Asp Ile Arg Val Trp 365 370 375 His Thr Ser Ser Asn Arg Glu Leu Leu Arg Ile Thr Val Pro Asn 380 385 390 Met Thr Cys His Gly Ile Asp Phe Met Arg Asp Gly Lys Ser Ile 395 400 405 Ile Ser Ala Trp Asn Asp Gly Lys Ile Arg Ala Phe Ala Pro Glu 410 415 420 Thr Gly Arg Leu Met Tyr Val Ile Asn Asn Ala His Arg Ile Gly 425 430 435 Val Thr Ala Ile Ala Thr Thr Ser Asp Cys Lys Arg Val Ile Ser 440 445 450 Gly Gly Gly Glu Gly Glu Val Arg Val Trp Gln Ile Gly Cys Gln 455 460 465 Thr Gln Lys Leu Glu Glu Ala Leu Lys Glu His Lys Ser Ser Val 470 475 480 Ser Cys Ile Arg Val Lys Arg Asn Asn Glu Glu Cys Val Thr Ala 485 490 495 Ser Thr Asp Gly Thr Cys Ile Ile Trp Asp Leu Val Arg Leu Arg 500 505 510 Arg Asn Gln Met Ile Leu Ala Asn Thr Leu Phe Gln Cys Val Cys 515 520 525 Tyr His Pro Glu Glu Phe Gln Ile Ile Thr Ser Gly Thr Asp Arg 530 535 540 Lys Ile Ala Tyr Trp Glu Val Phe Asp Gly Thr Val Ile Arg Glu 545 550 555 Leu Glu Gly Ser Leu Ser Gly Ser Ile Asn Gly Met Asp Ile Thr 560 565 570 Gln Glu Gly Val His Phe Val Thr Gly Gly Asn Asp His Leu Val 575 580 585 Lys Val Trp Asp Tyr Asn Glu Gly Glu Val Thr His Val Gly Val 590 595 600 Gly His Ser Gly Asn Ile Thr Arg Ile Arg Ile Ser Pro Gly Asn 605 610 615 Gln 
Tyr Ile Val Ser Val Ser Ala Asp Gly Ala Ile Leu Arg Trp 620 625 630 Lys Tyr Pro Tyr Thr Ser 635 8 593 PRT Homo sapiens misc_feature Incyte ID No 55140256CD1 8 Met Pro Leu Ser Val Ala Leu Gly Val Ala Ala Ile Asn Gln Ala 1 5 10 15 Ile Lys Glu Gly Lys Ala Ala Gln Thr Glu Arg Val Leu Arg Asn 20 25 30 Pro Ala Val Ala Leu Arg Gly Val Val Pro Asp Cys Ala Asn Gly 35 40 45 Tyr Gln Arg Ala Leu Glu Ser Ala Met Ala Lys Lys Gln Arg Pro 50 55 60 Gly Asn Gly Pro Asn Leu Thr Leu Leu Phe Ala Ala Cys Pro Ser 65 70 75 His Pro Leu Pro Ala Leu Asn Leu Phe Cys Leu Leu Gly Phe Cys 80 85 90 Leu Ala Asp Thr Ala Phe Trp Val Gln His Asp Met Lys Asp Gly 95 100 105 Thr Ala Tyr Tyr Phe His Leu Gln Thr Phe Gln Gly Ile Trp Glu 110 115 120 Gln Pro Pro Gly Cys Pro Leu Asn Thr Ser His Leu Thr Arg Glu 125 130 135 Glu Ile Gln Ser Thr Val Thr Lys Val Thr Ala Ala Tyr Asp Arg 140 145 150 Gln Gln Leu Trp Lys Ala Asn Val Gly Phe Val Ile Gln Leu Gln 155 160 165 Ala Arg Leu Arg Gly Phe Leu Val Arg Gln Lys Phe Ala Glu His 170 175 180 Ser His Phe Leu Arg Thr Trp Leu Pro Ala Val Ile Lys Ile Gln 185 190 195 Ala His Trp Arg Gly Tyr Arg Gln Arg Lys Ile Tyr Leu Glu Trp 200 205 210 Leu Gln Tyr Phe Lys Ala Asn Leu Asp Ala Ile Ile Lys Ile Gln 215 220 225 Ala Trp Ala Arg Met Trp Ala Ala Arg Arg Gln Tyr Leu Arg Arg 230 235 240 Leu His Tyr Phe Gln Lys Asn Val Asn Ser Ile Val Lys Ile Gln 245 250 255 Ala Phe Phe Arg Ala Arg Lys Ala Gln Asp Asp Tyr Arg Ile Leu 260 265 270 Val His Ala Pro His Pro Pro Leu Ser Val Val Arg Arg Phe Ala 275 280 285 His Leu Leu Asn Gln Ser Gln Gln Asp Phe Leu Ala Glu Ala Glu 290 295 300 Leu Leu Lys Leu Gln Glu Glu Val Val Arg Lys Ile Arg Ser Asn 305 310 315 Gln Gln Leu Glu Gln Asp Leu Asn Ile Met Asp Ile Lys Ile Gly 320 325 330 Leu Leu Val Lys Asn Arg Ile Thr Leu Gln Glu Val Val Ser His 335 340 345 Cys Lys Lys Leu Thr Lys Arg Asn Lys Glu Gln Leu Ser Asp Met 350 355 360 Met Val Leu Asp Lys Gln Lys Gly Leu Lys Ser Leu Ser Lys Glu 365 370 375 Lys Arg Gln Glu Leu Glu Ala Tyr Gln His Leu Phe Tyr Leu 
Leu 380 385 390 Gln Thr Gln Pro Ile Tyr Leu Ala Lys Leu Ile Phe Gln Met Pro 395 400 405 Gln Asn Lys Thr Thr Lys Leu Met Glu Ala Val Ile Phe Ser Leu 410 415 420 Tyr Asn Tyr Ala Ser Ser Arg Arg Glu Ala Tyr Leu Leu Leu Gln 425 430 435 Leu Phe Lys Thr Ala Leu Gln Glu Glu Ile Lys Ser Lys Val Glu 440 445 450 Gln Pro Gln Asp Val Val Thr Gly Asn Pro Thr Val Val Arg Leu 455 460 465 Val Val Arg Phe Tyr Arg Asn Gly Arg Gly Gln Ser Ala Leu Gln 470 475 480 Glu Ile Leu Gly Lys Val Ile Gln Asp Val Leu Glu Asp Lys Val 485 490 495 Leu Ser Val His Thr Asp Pro Val His Leu Tyr Lys Asn Trp Ile 500 505 510 Asn Gln Thr Glu Ala Gln Thr Gly Gln Arg Ser His Leu Pro Tyr 515 520 525 Asp Val Thr Pro Glu Gln Ala Leu Ser His Pro Glu Val Gln Arg 530 535 540 Arg Leu Asp Ile Ala Leu Arg Asn Leu Leu Ala Met Thr Asp Lys 545 550 555 Phe Leu Leu Ala Ile Thr Ser Ser Val Asp Gln Ile Pro Tyr Val 560 565 570 Gln Gln Pro His Pro Ser Leu Arg Met Gly Phe Gln Glu Gly Gln 575 580 585 Ala Thr Met Gly Ser Arg Ala Gly 590 9 1508 PRT Homo sapiens misc_feature Incyte ID No 2744344CD1 9 Met Gly Gln Glu Pro Arg Thr Leu Pro Pro Ser Pro Asn Trp Tyr 1 5 10 15 Cys Ala Arg Cys Ser Asp Ala Val Pro Gly Gly Leu Phe Gly Phe 20 25 30 Ala Ala Arg Thr Ser Val Phe Leu Val Arg Val Gly Pro Gly Ala 35 40 45 Gly Glu Ser Pro Gly Thr Pro Pro Phe Arg Val Ile Gly Glu Leu 50 55 60 Val Gly His Thr Glu Arg Val Ser Gly Phe Thr Phe Ser His His 65 70 75 Pro Gly Gln Tyr Asn Leu Cys Ala Thr Ser Ser Asp Asp Gly Thr 80 85 90 Val Lys Ile Trp Asp Val Glu Thr Lys Thr Val Val Thr Glu His 95 100 105 Ala Leu His Gln His Thr Ile Ser Thr Leu His Trp Ser Pro Arg 110 115 120 Val Lys Asp Leu Ile Val Ser Gly Asp Glu Lys Gly Val Val Phe 125 130 135 Cys Tyr Trp Phe Asn Arg Asn Asp Ser Gln His Leu Phe Ile Glu 140 145 150 Pro Arg Thr Ile Phe Cys Leu Thr Cys Ser Pro His His Glu Asp 155 160 165 Leu Val Ala Ile Gly Tyr Lys Asp Gly Ile Val Val Ile Ile Asp 170 175 180 Ile Ser Lys Lys Gly Glu Val Ile His Arg Leu Arg Gly His Asp 185 190 195 Asp Glu Ile His Ser Ile Ala 
Trp Cys Pro Leu Pro Gly Glu Asp 200 205 210 Cys Leu Ser Ile Asn Gln Glu Glu Thr Ser Glu Glu Ala Glu Ile 215 220 225 Thr Asn Gly Asn Ala Val Ala Gln Ala Pro Val Thr Lys Gly Cys 230 235 240 Tyr Leu Ala Thr Gly Ser Lys Asp Gln Thr Ile Arg Ile Trp Ser 245 250 255 Cys Ser Arg Gly Arg Gly Val Met Ile Leu Lys Leu Pro Phe Leu 260 265 270 Lys Arg Arg Gly Gly Gly Ile Asp Pro Thr Val Lys Glu Arg Leu 275 280 285 Trp Leu Thr Leu His Trp Pro Ser Asn Gln Pro Thr Gln Leu Val 290 295 300 Ser Ser Cys Phe Gly Gly Glu Leu Leu Gln Trp Asp Leu Thr Gln 305 310 315 Ser Trp Arg Arg Lys Tyr Thr Leu Phe Ser Ala Ser Ser Glu Gly 320 325 330 Gln Asn His Ser Arg Ile Val Phe Asn Leu Cys Pro Leu Gln Thr 335 340 345 Glu Asp Asp Lys Gln Leu Leu Leu Ser Thr Ser Met Asp Arg Asp 350 355 360 Val Lys Cys Trp Asp Ile Ala Thr Leu Glu Cys Ser Trp Thr Leu 365 370 375 Pro Ser Leu Gly Gly Phe Ala Tyr Ser Leu Ala Phe Ser Ser Val 380 385 390 Asp Ile Gly Ser Leu Ala Ile Gly Val Gly Asp Gly Met Ile Arg 395 400 405 Val Trp Asn Thr Leu Ser Ile Lys Asn Asn Tyr Asp Val Lys Asn 410 415 420 Phe Trp Gln Gly Val Lys Ser Lys Val Thr Ala Leu Cys Trp His 425 430 435 Pro Thr Lys Glu Gly Cys Leu Ala Phe Gly Thr Asp Asp Gly Lys 440 445 450 Val Gly Leu Tyr Asp Thr Tyr Ser Asn Lys Pro Pro Gln Ile Ser 455 460 465 Ser Thr Tyr His Lys Lys Thr Val Tyr Thr Leu Ala Trp Gly Pro 470 475 480 Pro Val Pro Pro Met Ser Leu Gly Gly Glu Gly Asp Arg Pro Ser 485 490 495 Leu Ala Leu Tyr Ser Cys Gly Gly Glu Gly Ile Val Leu Gln His 500 505 510 Asn Pro Trp Lys Leu Ser Gly Glu Ala Phe Asp Ile Asn Lys Leu 515 520 525 Ile Arg Asp Thr Asn Ser Ile Lys Tyr Lys Leu Pro Val His Thr 530 535 540 Glu Ile Ser Trp Lys Ala Asp Gly Lys Ile Met Ala Leu Gly Asn 545 550 555 Glu Asp Gly Ser Ile Glu Ile Phe Gln Ile Pro Asn Leu Lys Leu 560 565 570 Ile Cys Thr Ile Gln Gln His His Lys Leu Val Asn Thr Ile Ser 575 580 585 Trp His His Glu His Gly Ser Gln Pro Glu Leu Ser Tyr Leu Met 590 595 600 Ala Ser Gly Ser Asn Asn Ala Val Ile Tyr Val His Asn Leu Lys 605 610 615 Thr Val Ile 
Glu Ser Ser Pro Glu Ser Pro Val Thr Ile Thr Glu 620 625 630 Pro Tyr Arg Thr Leu Ser Gly His Thr Ala Lys Ile Thr Ser Val 635 640 645 Ala Trp Ser Pro His His Asp Gly Arg Leu Val Ser Ala Ser Tyr 650 655 660 Asp Gly Thr Ala Gln Val Trp Asp Ala Leu Arg Glu Glu Pro Leu 665 670 675 Cys Asn Phe Arg Gly His Gln Gly Arg Leu Leu Cys Val Ala Trp 680 685 690 Ser Pro Leu Asp Pro Asp Cys Ile Tyr Ser Gly Ala Asp Asp Phe 695 700 705 Cys Val His Lys Trp Leu Thr Ser Met Gln Asp His Ser Arg Pro 710 715 720 Pro Gln Gly Lys Lys Ser Ile Glu Leu Glu Lys Lys Arg Leu Ser 725 730 735 Gln Pro Lys Ala Lys Pro Lys Lys Lys Lys Lys Pro Thr Leu Arg 740 745 750 Thr Pro Val Lys Leu Glu Ser Ile Asp Gly Asn Glu Glu Glu Ser 755 760 765 Met Lys Glu Asn Ser Gly Pro Val Glu Asn Gly Val Ser Asp Gln 770 775 780 Glu Gly Glu Glu Gln Ala Arg Glu Pro Glu Leu Pro Cys Gly Leu 785 790 795 Ala Pro Ala Val Ser Arg Glu Pro Val Ile Cys Thr Pro Val Ser 800 805 810 Ser Gly Phe Glu Lys Ser Lys Val Thr Ile Asn Asn Lys Val Ile 815 820 825 Leu Leu Lys Lys Glu Pro Pro Lys Glu Lys Pro Glu Thr Leu Ile 830 835 840 Lys Lys Arg Lys Ala Arg Ser Leu Leu Pro Leu Ser Thr Ser Leu 845 850 855 Asp His Arg Ser Lys Glu Glu Leu His Gln Asp Cys Leu Val Leu 860 865 870 Ala Thr Ala Lys His Ser Arg Glu Leu Asn Glu Asp Val Ser Ala 875 880 885 Asp Val Glu Glu Arg Phe His Leu Gly Leu Phe Thr Asp Arg Ala 890 895 900 Thr Leu Tyr Arg Met Ile Asp Ile Glu Gly Lys Gly His Leu Glu 905 910 915 Asn Gly His Pro Glu Leu Phe His Gln Leu Met Leu Trp Lys Gly 920 925 930 Asp Leu Lys Gly Val Leu Gln Thr Ala Ala Glu Arg Gly Glu Leu 935 940 945 Thr Asp Asn Leu Val Ala Met Ala Pro Ala Ala Gly Tyr His Val 950 955 960 Trp Leu Trp Ala Val Glu Ala Phe Ala Lys Gln Leu Cys Phe Gln 965 970 975 Asp Gln Tyr Val Lys Ala Ala Ser His Leu Leu Ser Ile His Lys 980 985 990 Val Tyr Glu Ala Val Glu Leu Leu Lys Ser Asn His Phe Tyr Arg 995 1000 1005 Glu Ala Ile Ala Ile Ala Lys Ala Arg Leu Arg Pro Glu Asp Pro 1010 1015 1020 Val Leu Lys Asp Leu Tyr Leu Ser Trp Gly Thr Val Leu Glu Arg 
1025 1030 1035 Asp Gly His Tyr Ala Val Ala Ala Lys Cys Tyr Leu Gly Ala Thr 1040 1045 1050 Cys Ala Tyr Asp Ala Ala Lys Val Leu Ala Lys Lys Gly Asp Ala 1055 1060 1065 Ala Ser Leu Arg Thr Ala Ala Glu Leu Ala Ala Ile Val Gly Glu 1070 1075 1080 Asp Glu Leu Ser Ala Ser Leu Ala Leu Arg Cys Ala Gln Glu Leu 1085 1090 1095 Leu Leu Ala Asn Asn Trp Val Gly Ala Gln Glu Ala Leu Gln Leu 1100 1105 1110 His Glu Ser Leu Gln Gly Gln Arg Leu Val Phe Cys Leu Leu Glu 1115 1120 1125 Leu Leu Ser Arg His Leu Glu Glu Lys Gln Leu Ser Glu Gly Lys 1130 1135 1140 Ser Ser Ser Ser Tyr His Thr Trp Asn Thr Gly Thr Glu Gly Pro 1145 1150 1155 Phe Val Glu Arg Val Thr Ala Val Trp Lys Ser Ile Phe Ser Leu 1160 1165 1170 Asp Thr Pro Glu Gln Tyr Gln Glu Ala Phe Gln Lys Leu Gln Asn 1175 1180 1185 Ile Lys Tyr Pro Ser Ala Thr Asn Asn Thr Pro Ala Lys Gln Leu 1190 1195 1200 Leu Leu His Ile Cys His Asp Leu Thr Leu Ala Val Leu Ser Gln 1205 1210 1215 Gln Met Ala Ser Trp Asp Glu Ala Val Gln Ala Leu Leu Arg Ala 1220 1225 1230 Val Val Arg Ser Tyr Asp Ser Gly Ser Phe Thr Ile Met Gln Glu 1235 1240 1245 Val Tyr Ser Ala Phe Leu Pro Asp Gly Cys Asp His Leu Arg Asp 1250 1255 1260 Lys Leu Gly Asp His Gln Ser Pro Ala Thr Pro Ala Phe Lys Ser 1265 1270 1275 Leu Glu Ala Phe Phe Leu Tyr Gly Arg Leu Tyr Glu Phe Trp Trp 1280 1285 1290 Ser Leu Ser Arg Pro Cys Pro Asn Ser Ser Val Trp Val Arg Ala 1295 1300 1305 Gly His Arg Thr Leu Ser Val Glu Pro Ser Gln Gln Leu Asp Thr 1310 1315 1320 Ala Ser Thr Glu Glu Thr Asp Pro Glu Thr Ser Gln Pro Glu Pro 1325 1330 1335 Asn Arg Pro Ser Glu Leu Asp Leu Arg Leu Thr Glu Glu Gly Glu 1340 1345 1350 Arg Met Leu Ser Thr Phe Lys Glu Leu Phe Ser Glu Lys His Ala 1355 1360 1365 Ser Leu Gln Asn Ser Gln Arg Thr Val Ala Glu Val Gln Glu Thr 1370 1375 1380 Leu Ala Glu Met Ile Arg Gln His Gln Lys Ser Gln Leu Cys Lys 1385 1390 1395 Ser Thr Ala Asn Gly Pro Asp Lys Asn Glu Pro Glu Val Glu Ala 1400 1405 1410 Glu Gln Pro Leu Cys Ser Ser Gln Ser Gln Cys Lys Glu Glu Lys 1415 1420 1425 Asn Glu Pro Leu Ser Leu Pro Glu 
Leu Thr Lys Arg Leu Thr Glu 1430 1435 1440 Ala Asn Gln Arg Met Ala Lys Phe Pro Glu Ser Ile Lys Ala Trp 1445 1450 1455 Pro Phe Pro Asp Val Leu Glu Cys Cys Leu Val Leu Leu Leu Ile 1460 1465 1470 Arg Ser His Phe Pro Gly Cys Leu Ala Gln Glu Met Gln Gln Gln 1475 1480 1485 Ala Gln Glu Leu Leu Gln Lys Tyr Gly Asn Thr Lys Thr Tyr Arg 1490 1495 1500 Arg His Cys Gln Thr Phe Cys Met 1505 10 404 PRT Homo sapiens misc_feature Incyte ID No 1555147CD1 10 Met Phe Phe Tyr Leu Ser Lys Lys Ile Ser Ile Pro Asn Asn Val 1 5 10 15 Lys Leu Gln Cys Val Ser Trp Asn Lys Glu Gln Gly Phe Ile Ala 20 25 30 Cys Gly Gly Glu Asp Gly Leu Leu Lys Val Leu Lys Leu Glu Thr 35 40 45 Gln Thr Asp Asp Ala Lys Leu Arg Gly Leu Ala Ala Pro Ser Asn 50 55 60 Leu Ser Met Asn Gln Thr Leu Glu Gly His Ser Gly Ser Val Gln 65 70 75 Val Val Thr Trp Asn Glu Gln Tyr Gln Lys Leu Thr Thr Ser Asp 80 85 90 Glu Asn Gly Leu Ile Ile Val Trp Met Leu Tyr Lys Gly Ser Trp 95 100 105 Ile Glu Glu Met Ile Asn Asn Arg Asn Lys Ser Val Val Arg Ser 110 115 120 Met Ser Trp Asn Ala Asp Gly Gln Lys Ile Cys Ile Val Tyr Glu 125 130 135 Asp Gly Ala Val Ile Val Gly Ser Val Asp Gly Asn Arg Ile Trp 140 145 150 Gly Lys Asp Leu Lys Gly Ile Gln Leu Ser His Val Thr Trp Ser 155 160 165 Ala Asp Ser Lys Val Leu Leu Phe Gly Met Ala Asn Gly Glu Ile 170 175 180 His Ile Tyr Asp Asn Gln Gly Asn Phe Met Ile Lys Met Lys Leu 185 190 195 Ser Cys Leu Val Asn Val Thr Gly Ala Ile Ser Ile Ala Gly Ile 200 205 210 His Trp Tyr His Gly Thr Glu Gly Tyr Val Glu Pro Asp Cys Pro 215 220 225 Cys Leu Ala Val Cys Phe Asp Asn Gly Arg Cys Gln Ile Met Arg 230 235 240 His Glu Asn Asp Gln Asn Pro Val Leu Ile Asp Thr Gly Met Tyr 245 250 255 Val Val Gly Ile Gln Trp Asn His Met Gly Ser Val Leu Ala Val 260 265 270 Ala Gly Phe Gln Lys Ala Ala Met Gln Asp Lys Asp Val Asn Ile 275 280 285 Val Gln Phe Tyr Thr Pro Phe Gly Glu His Leu Gly Thr Leu Lys 290 295 300 Val Pro Gly Lys Glu Ile Ser Ala Leu Ser Trp Glu Gly Gly Gly 305 310 315 Leu Lys Ile Ala Leu Ala Val Asp Ser Phe Ile Tyr Phe Ala Asn 
320 325 330 Ile Arg Pro Asn Tyr Lys Trp Gly Tyr Cys Ser Asn Thr Val Val 335 340 345 Tyr Ala Tyr Thr Arg Pro Asp Arg Pro Glu Tyr Cys Val Val Phe 350 355 360 Trp Asp Thr Lys Asn Asn Glu Lys Tyr Val Lys Tyr Val Lys Gly 365 370 375 Leu Ile Ser Ile Thr Thr Cys Gly Asp Phe Cys Ile Leu Ala Thr 380 385 390 Lys Ala Asp Glu Asn His Pro Gln Tyr His Cys Leu Leu Gln 395 400 11 567 PRT Homo sapiens misc_feature Incyte ID No 1939136CD1 11 Met Val Thr Ala Arg Arg Arg Gly Ser Gly Cys Ala Arg Gly Arg 1 5 10 15 Ala Gly Arg Gly Gly Arg Ser Arg Gly Arg Gly Gln Gly Arg Leu 20 25 30 Arg Gly Phe Ser Arg Arg Arg Arg Gln Gly Glu Phe Pro Gly Ser 35 40 45 Gly His Ile Gly Ser Ile Gln Pro Gln Pro Pro Gly Arg Ser Ala 50 55 60 Ser Arg Ser Arg Leu Val Pro Val Ala Ala Pro Ala Leu Val Pro 65 70 75 Ala His Pro Pro Gly Ala Glu Leu Ala Met Ala Ala Thr Asp Leu 80 85 90 Glu Arg Phe Ser Asn Ala Glu Pro Glu Pro Arg Ser Leu Ser Leu 95 100 105 Gly Gly His Val Gly Phe Asp Ser Leu Pro Asp Gln Leu Val Ser 110 115 120 Lys Ser Val Thr Gln Gly Phe Ser Phe Asn Ile Leu Cys Val Gly 125 130 135 Glu Thr Gly Ile Gly Lys Ser Thr Leu Met Asn Thr Leu Phe Asn 140 145 150 Thr Thr Phe Glu Thr Glu Glu Ala Ser His His Glu Ala Cys Val 155 160 165 Arg Leu Arg Pro Gln Thr Tyr Asp Leu Gln Glu Ser Asn Val Gln 170 175 180 Leu Lys Leu Thr Ile Val Asp Ala Val Gly Phe Gly Asp Gln Ile 185 190 195 Asn Lys Asp Glu Ser Tyr Arg Pro Ile Val Asp Tyr Ile Asp Ala 200 205 210 Gln Phe Glu Asn Tyr Leu Gln Glu Glu Leu Lys Ile Arg Arg Ser 215 220 225 Leu Phe Asp Tyr His Asp Thr Arg Ile His Val Cys Leu Tyr Phe 230 235 240 Ile Thr Pro Thr Gly His Ser Leu Lys Ser Leu Asp Leu Val Thr 245 250 255 Met Lys Lys Leu Asp Ser Lys Val Asn Ile Ile Pro Ile Ile Ala 260 265 270 Lys Ala Asp Thr Ile Ser Lys Ser Glu Leu His Lys Phe Lys Ile 275 280 285 Lys Ile Met Gly Glu Leu Val Ser Asn Gly Val Gln Ile Tyr Gln 290 295 300 Phe Pro Thr Asp Asp Glu Ala Val Ala Glu Ile Asn Ala Val Met 305 310 315 Asn Ala His Leu Pro Phe Ala Val Val Gly Ser Thr Glu Glu Val 320 325 330 Lys 
Val Gly Asn Lys Leu Val Arg Ala Arg Gln Tyr Pro Trp Gly 335 340 345 Val Val Gln Val Glu Asn Glu Asn His Cys Asp Phe Val Lys Leu 350 355 360 Arg Glu Met Leu Ile Arg Val Asn Met Glu Asp Leu Arg Glu Gln 365 370 375 Thr His Ser Arg His Tyr Glu Leu Tyr Arg Arg Cys Lys Leu Glu 380 385 390 Glu Met Gly Phe Gln Asp Ser Asp Gly Asp Ser Gln Pro Phe Ser 395 400 405 Leu Gln Glu Thr Tyr Glu Ala Lys Arg Lys Glu Phe Leu Ser Glu 410 415 420 Leu Gln Arg Lys Glu Glu Glu Met Arg Gln Met Phe Val Asn Lys 425 430 435 Val Lys Glu Thr Glu Leu Glu Leu Lys Glu Lys Glu Arg Glu Leu 440 445 450 His Glu Lys Phe Glu His Leu Lys Arg Val His Gln Glu Glu Lys 455 460 465 Arg Lys Val Glu Glu Lys Arg Arg Glu Leu Glu Glu Glu Thr Asn 470 475 480 Ala Phe Asn Arg Arg Lys Ala Ala Val Glu Ala Leu Gln Ser Gln 485 490 495 Ala Leu His Ala Thr Ser Gln Gln Pro Leu Arg Lys Asp Lys Asp 500 505 510 Lys Lys Asn Arg Ser Asp Ile Gly Ala His Gln Pro Gly Met Ser 515 520 525 Leu Ser Ser Ser Lys Val Met Met Thr Lys Ala Ser Val Glu Pro 530 535 540 Leu Asn Cys Ser Ser Trp Trp Pro Ala Ile Gln Cys Cys Ser Cys 545 550 555 Leu Val Arg Asp Ala Thr Trp Arg Glu Gly Phe Leu 560 565 12 1120 PRT Homo sapiens misc_feature Incyte ID No 5956978CD1 12 Met Ala Arg Val Glu Ser Pro Val Pro Ala Ala Arg Ala Ser Leu 1 5 10 15 Thr Gly Ser Cys Val Leu Gly Gln Ala Met Pro Leu Arg Gly Gly 20 25 30 Ala Gly Pro Ser Pro Ala Ser His Gly Pro Thr His Gly Pro Ser 35 40 45 Asp Pro Arg Thr Cys Leu Pro Gly Arg Gly Ala Gly Gly Met Arg 50 55 60 Pro His Gly Arg Gly Ala Leu Gly Cys Cys Gly Leu Cys Ser Phe 65 70 75 Tyr Thr Cys His Gly Ala Ala Gly Asp Glu Ile Met His Gln Asp 80 85 90 Ile Val Pro Leu Cys Ala Ala Asp Ile Gln Asp Gln Leu Lys Lys 95 100 105 Arg Phe Ala Tyr Leu Ser Gly Gly Arg Gly Gln Asp Gly Ser Pro 110 115 120 Val Ile Thr Phe Pro Asp Tyr Pro Ala Phe Ser Glu Ile Pro Asp 125 130 135 Lys Glu Phe Gln Asn Val Met Thr Tyr Leu Thr Ser Ile Pro Ser 140 145 150 Leu Gln Asp Ala Gly Ile Gly Phe Ile Leu Val Ile Asp Arg Arg 155 160 165 Arg Asp Lys Trp Thr Ser Val 
Lys Ala Ser Val Leu Arg Ile Ala 170 175 180 Ala Ser Phe Pro Ala Asn Leu Gln Leu Val Leu Val Leu Arg Pro 185 190 195 Thr Gly Phe Phe Gln Arg Thr Leu Ser Asp Ile Ala Phe Lys Phe 200 205 210 Asn Arg Asp Asp Phe Lys Met Lys Val Pro Val Ile Met Leu Ser 215 220 225 Ser Val Pro Asp Leu His Gly Tyr Ile Asp Lys Ser Gln Leu Thr 230 235 240 Glu Asp Leu Gly Gly Thr Leu Asp Tyr Cys His Ser Arg Trp Leu 245 250 255 Cys Gln Arg Thr Ala Ile Glu Ser Phe Ala Leu Met Val Lys Gln 260 265 270 Thr Ala Gln Met Leu Gln Ser Phe Gly Thr Glu Leu Ala Glu Thr 275 280 285 Glu Leu Pro Asn Asp Val Gln Ser Thr Ser Ser Val Leu Cys Ala 290 295 300 His Thr Glu Lys Lys Asp Lys Ala Lys Glu Asp Leu Arg Leu Ala 305 310 315 Leu Lys Glu Gly His Ser Val Leu Glu Ser Leu Arg Glu Leu Gln 320 325 330 Ala Glu Gly Ser Glu Pro Ser Val Asn Gln Asp Gln Leu Asp Asn 335 340 345 Gln Ala Thr Val Gln Arg Leu Leu Ala Gln Leu Asn Glu Thr Glu 350 355 360 Ala Ala Phe Asp Glu Phe Trp Ala Lys His Gln Gln Lys Leu Glu 365 370 375 Gln Cys Leu Gln Leu Arg His Phe Glu Gln Gly Phe Arg Glu Val 380 385 390 Lys Ala Ile Leu Asp Ala Ala Ser Gln Lys Ile Ala Thr Phe Thr 395 400 405 Asp Ile Gly Asn Ser Leu Ala His Val Glu His Leu Leu Arg Asp 410 415 420 Leu Ala Ser Phe Glu Glu Lys Ser Gly Val Ala Val Glu Arg Ala 425 430 435 Arg Ala Leu Ser Leu Asp Gly Glu Gln Leu Ile Gly Asn Lys His 440 445 450 Tyr Ala Val Asp Ser Ile Arg Pro Lys Cys Gln Glu Leu Arg His 455 460 465 Leu Cys Asp Gln Phe Ser Ala Glu Ile Ala Arg Arg Arg Gly Leu 470 475 480 Leu Ser Lys Ser Leu Glu Leu His Arg Arg Leu Glu Thr Ser Met 485 490 495 Lys Trp Cys Asp Glu Gly Ile Tyr Leu Leu Ala Ser Gln Pro Val 500 505 510 Asp Lys Cys Gln Ser Gln Asp Gly Ala Glu Ala Ala Leu Gln Glu 515 520 525 Ile Glu Lys Phe Leu Glu Thr Gly Ala Glu Asn Lys Ile Gln Glu 530 535 540 Leu Asn Ala Ile Tyr Lys Glu Tyr Glu Ser Ile Leu Asn Gln Asp 545 550 555 Leu Met Glu His Val Arg Lys Val Phe Gln Lys Gln Ala Ser Met 560 565 570 Glu Glu Val Phe His Arg Arg Gln Ala Ser Leu Lys Lys Leu Ala 575 580 585 Ala Arg Gln 
Thr Arg Pro Val Gln Pro Val Ala Pro Arg Pro Glu 590 595 600 Ala Leu Ala Lys Ser Pro Cys Pro Ser Pro Gly Ile Arg Arg Gly 605 610 615 Ser Glu Asn Ser Ser Ser Glu Gly Gly Ala Leu Arg Arg Gly Pro 620 625 630 Tyr Arg Arg Ala Lys Ser Glu Met Ser Glu Ser Arg Gln Gly Arg 635 640 645 Gly Ser Ala Gly Glu Glu Glu Glu Ser Leu Ala Ile Leu Arg Arg 650 655 660 His Val Met Ser Glu Leu Leu Asp Thr Glu Arg Ala Tyr Val Glu 665 670 675 Glu Leu Leu Cys Val Leu Glu Gly Tyr Ala Ala Glu Met Asp Asn 680 685 690 Pro Leu Met Ala His Leu Leu Ser Thr Gly Leu His Asn Lys Lys 695 700 705 Asp Val Leu Phe Gly Asn Met Glu Glu Ile Tyr His Phe His Asn 710 715 720 Arg Ile Phe Leu Arg Glu Leu Glu Asn Tyr Thr Asp Cys Pro Glu 725 730 735 Leu Val Gly Arg Cys Phe Leu Glu Arg Met Glu Asp Phe Gln Ile 740 745 750 Tyr Glu Lys Tyr Cys Gln Asn Lys Pro Arg Ser Glu Ser Leu Trp 755 760 765 Arg Gln Cys Ser Asp Cys Pro Phe Phe Gln Glu Cys Gln Arg Lys 770 775 780 Leu Asp His Lys Leu Ser Leu Asp Ser Tyr Leu Leu Lys Pro Val 785 790 795 Gln Arg Ile Thr Lys Tyr Gln Leu Leu Leu Lys Glu Met Leu Lys 800 805 810 Tyr Ser Arg Asn Cys Glu Gly Ala Glu Asp Leu Gln Glu Ala Leu 815 820 825 Ser Ser Ile Leu Gly Ile Leu Lys Ala Val Asn Asp Ser Met His 830 835 840 Leu Ile Ala Ile Thr Gly Tyr Asp Gly Asn Leu Gly Asp Leu Gly 845 850 855 Lys Leu Leu Met Gln Gly Ser Phe Ser Val Trp Thr Asp His Lys 860 865 870 Arg Gly His Thr Lys Val Lys Glu Leu Ala Arg Phe Lys Pro Met 875 880 885 Gln Arg His Leu Phe Leu His Glu Lys Ala Val Leu Phe Cys Lys 890 895 900 Lys Arg Glu Glu Asn Gly Glu Gly Tyr Glu Lys Ala Pro Ser Tyr 905 910 915 Ser Tyr Lys Gln Ser Leu Asn Met Ala Ala Val Gly Ile Thr Glu 920 925 930 Asn Val Lys Gly Asp Ala Lys Lys Phe Glu Ile Trp Tyr Asn Ala 935 940 945 Arg Glu Glu Val Tyr Ile Val Gln Ala Pro Thr Pro Glu Ile Lys 950 955 960 Ala Ala Trp Val Asn Glu Ile Arg Lys Val Leu Thr Ser Gln Leu 965 970 975 Gln Ala Cys Arg Glu Ala Ser Gln His Arg Ala Leu Glu Gln Ser 980 985 990 Gln Ser Leu Pro Leu Pro Ala Pro Thr Ser Thr Ser Pro Ser Arg 995 
1000 1005 Gly Asn Ser Arg Asn Ile Lys Lys Leu Glu Glu Arg Lys Thr Asp 1010 1015 1020 Pro Leu Ser Leu Glu Gly Tyr Val Ser Ser Ala Pro Leu Thr Lys 1025 1030 1035 Pro Pro Glu Lys Gly Lys Gly Trp Ser Lys Thr Ser His Ser Leu 1040 1045 1050 Glu Ala Pro Glu Asp Asp Gly Gly Trp Ser Ser Ala Glu Glu Gln 1055 1060 1065 Ile Asn Ser Ser Asp Ala Glu Glu Asp Gly Gly Leu Gly Pro Lys 1070 1075 1080 Lys Leu Val Pro Gly Lys Tyr Thr Val Val Ala Asp His Glu Lys 1085 1090 1095 Gly Gly Pro Asp Ala Leu Arg Val Arg Ser Gly Asp Val Val Glu 1100 1105 1110 Leu Val Gln Glu Gly Asp Glu Gly Leu Trp 1115 1120 13 244 PRT Homo sapiens misc_feature Incyte ID No 7662817CD1 13 Met Asp Pro Gly Ala Ala Leu Gln Arg Arg Ala Gly Gly Gly Gly 1 5 10 15 Gly Leu Gly Ala Gly Ser Pro Ala Leu Ser Gly Gly Gln Gly Arg 20 25 30 Arg Lys Lys Gln Pro Pro Arg Pro Ala Asp Phe Lys Leu Gln Val 35 40 45 Ile Ile Ile Gly Ser Arg Gly Val Gly Lys Thr Ser Leu Met Glu 50 55 60 Arg Phe Thr Asp Asp Thr Phe Cys Glu Ala Cys Lys Ser Thr Val 65 70 75 Gly Val Asp Phe Lys Ile Lys Thr Val Glu Leu Arg Gly Lys Lys 80 85 90 Ile Arg Leu Gln Ile Trp Asp Thr Ala Gly Gln Glu Arg Phe Asn 95 100 105 Ser Ile Thr Ser Ala Tyr Tyr Arg Ser Ala Lys Gly Ile Ile Leu 110 115 120 Val Tyr Asp Ile Thr Lys Lys Glu Thr Phe Asp Asp Leu Pro Lys 125 130 135 Trp Met Lys Met Ile Asp Lys Tyr Ala Ser Glu Asp Ala Glu Leu 140 145 150 Leu Leu Val Gly Asn Lys Leu Asp Cys Glu Thr Asp Arg Glu Ile 155 160 165 Thr Arg Gln Gln Gly Glu Lys Phe Ala Gln Gln Ile Thr Gly Met 170 175 180 Arg Phe Cys Glu Ala Ser Ala Lys Asp Asn Phe Asn Val Asp Glu 185 190 195 Ile Phe Leu Lys Leu Val Asp Asp Ile Leu Lys Lys Met Pro Leu 200 205 210 Asp Ile Leu Arg Asn Glu Leu Ser Asn Ser Ile Leu Ser Leu Gln 215 220 225 Pro Glu Pro Glu Ile Pro Pro Glu Leu Pro Pro Pro Arg Pro His 230 235 240 Val Arg Cys Cys 14 1251 PRT Homo sapiens misc_feature Incyte ID No 55139221CD1 14 Met Ala Arg Gly Asp Ala Gly Arg Gly Arg Gly Leu Leu Ala Leu 1 5 10 15 Thr Phe Cys Leu Leu Ala Ala Arg Gly Glu Leu Leu Leu Pro Gln 20 
25 30 Glu Thr Thr Val Glu Leu Ser Cys Gly Val Gly Pro Leu Gln Val 35 40 45 Ile Leu Gly Pro Glu Gln Ala Ala Val Leu Asn Cys Ser Leu Gly 50 55 60 Ala Ala Ala Ala Gly Pro Pro Thr Arg Val Thr Trp Ser Lys Asp 65 70 75 Gly Asp Thr Leu Leu Glu His Asp His Leu His Leu Leu Pro Asn 80 85 90 Gly Ser Leu Trp Leu Ser Gln Pro Leu Ala Pro Asn Gly Ser Asp 95 100 105 Glu Ser Val Pro Glu Ala Val Gly Val Ile Glu Gly Asn Tyr Ser 110 115 120 Cys Leu Ala His Gly Pro Leu Gly Val Leu Ala Ser Gln Thr Ala 125 130 135 Val Val Lys Leu Ala Ser Leu Ala Asp Phe Ser Leu His Pro Glu 140 145 150 Ser Gln Thr Val Glu Glu Asn Gly Thr Ala Arg Phe Glu Cys His 155 160 165 Ile Glu Gly Leu Pro Ala Pro Ile Ile Thr Trp Glu Lys Asp Gln 170 175 180 Val Thr Leu Pro Glu Glu Pro Arg Arg Leu Ile Val Leu Pro Asn 185 190 195 Gly Val Leu Gln Ile Leu Asp Val Gln Glu Ser Asp Ala Gly Pro 200 205 210 Tyr Arg Cys Val Ala Thr Asn Ser Ala Arg Gln His Phe Ser Gln 215 220 225 Glu Ala Leu Leu Ser Val Ala His Arg Gly Ser Leu Ala Ser Thr 230 235 240 Arg Gly Gln Asp Val Val Ile Val Ala Ala Pro Glu Asn Thr Thr 245 250 255 Val Val Ser Gly Gln Ser Val Val Met Glu Cys Val Ala Ser Ala 260 265 270 Asp Pro Thr Pro Phe Val Ser Trp Val Arg Gln Asp Gly Lys Pro 275 280 285 Ile Ser Thr Asp Val Ile Val Leu Gly Arg Thr Asn Leu Leu Ile 290 295 300 Ala Asn Ala Gln Pro Trp His Ser Gly Val Tyr Val Cys Arg Ala 305 310 315 Asn Lys Pro Arg Thr Arg Asp Phe Ala Thr Ala Ala Ala Glu Leu 320 325 330 Arg Val Leu Ala Ala Pro Ala Ile Thr Gln Ala Pro Glu Ala Leu 335 340 345 Ser Arg Thr Arg Ala Ser Thr Ala Arg Phe Val Cys Arg Ala Ser 350 355 360 Gly Glu Pro Arg Pro Ala Leu Arg Trp Leu His Asn Gly Ala Pro 365 370 375 Leu Arg Pro Asn Gly Arg Val Lys Val Gln Gly Gly Gly Gly Ser 380 385 390 Leu Val Ile Thr Gln Ile Gly Leu Gln Asp Ala Gly Tyr Tyr Gln 395 400 405 Cys Val Ala Glu Asn Ser Ala Gly Met Ala Cys Ala Ala Ala Ser 410 415 420 Leu Ala Val Val Val Arg Glu Gly Leu Pro Ser Ala Pro Thr Arg 425 430 435 Val Thr Ala Thr Pro Leu Ser Ser Ser Ala Val Leu Val Ala Trp 
440 445 450 Glu Arg Pro Glu Met His Ser Glu Gln Ile Ile Gly Phe Ser Leu 455 460 465 His Tyr Gln Lys Ala Arg Gly Met Asp Asn Val Glu Tyr Gln Phe 470 475 480 Ala Val Asn Asn Asp Thr Thr Glu Leu Gln Val Arg Asp Leu Glu 485 490 495 Pro Asn Thr Asp Tyr Glu Phe Tyr Val Val Ala Tyr Ser Gln Leu 500 505 510 Gly Ala Ser Arg Thr Ser Thr Pro Ala Leu Val His Thr Leu Asp 515 520 525 Asp Val Pro Ser Ala Ala Pro Gln Leu Ser Leu Ser Ser Pro Asn 530 535 540 Pro Ser Asp Ile Arg Val Ala Trp Leu Pro Leu Pro Pro Ser Leu 545 550 555 Ser Asn Gly Gln Val Val Lys Tyr Lys Ile Glu Tyr Gly Leu Gly 560 565 570 Lys Glu Asp Gln Ile Phe Ser Thr Glu Val Arg Gly Asn Glu Thr 575 580 585 Gln Leu Met Leu Asn Ser Leu Gln Pro Asn Lys Val Tyr Arg Val 590 595 600 Arg Ile Ser Ala Gly Thr Ala Ala Gly Phe Gly Ala Pro Ser Gln 605 610 615 Trp Met His His Arg Thr Pro Ser Met His Asn Gln Ser His Val 620 625 630 Pro Phe Ala Pro Ala Glu Leu Lys Val Gln Ala Lys Met Glu Ser 635 640 645 Leu Val Val Ser Trp Gln Pro Pro Pro His Pro Thr Gln Ile Ser 650 655 660 Gly Tyr Lys Leu Tyr Trp Arg Glu Val Gly Ala Glu Glu Glu Ala 665 670 675 Asn Gly Asp Arg Leu Pro Gly Gly Arg Gly Asp Gln Ala Trp Asp 680 685 690 Val Gly Pro Val Arg Leu Lys Lys Lys Val Lys Gln Tyr Glu Leu 695 700 705 Thr Gln Leu Val Pro Gly Arg Leu Tyr Glu Val Lys Leu Val Ala 710 715 720 Phe Asn Lys His Glu Asp Gly Tyr Ala Ala Val Trp Lys Gly Lys 725 730 735 Thr Glu Lys Ala Pro Ala Pro Asp Met Pro Ile Gln Arg Gly Pro 740 745 750 Pro Leu Pro Pro Ala His Val His Ala Glu Ser Asn Ser Ser Thr 755 760 765 Ser Ile Trp Leu Arg Trp Lys Lys Pro Asp Phe Thr Thr Val Lys 770 775 780 Ile Val Asn Tyr Thr Val Arg Phe Ser Pro Trp Gly Leu Arg Asn 785 790 795 Ala Ser Leu Val Thr Tyr Tyr Thr Ser Ser Gly Glu Asp Ile Leu 800 805 810 Ile Gly Gly Leu Lys Pro Phe Thr Lys Tyr Glu Phe Ala Val Gln 815 820 825 Ser His Gly Val Asp Met Asp Gly Pro Phe Gly Ser Val Val Glu 830 835 840 Arg Ser Thr Leu Pro Asp Arg Pro Ser Thr Pro Pro Ser Asp Leu 845 850 855 Arg Leu Ser Pro Leu Thr Pro Ser Thr Val Arg 
Leu His Trp Cys 860 865 870 Pro Pro Thr Glu Pro Asn Gly Glu Ile Val Glu Tyr Leu Ile Leu 875 880 885 Tyr Ser Ser Asn His Thr Gln Pro Glu His Gln Trp Thr Leu Leu 890 895 900 Thr Thr Gln Gly Asn Ile Phe Ser Ala Glu Val His Gly Leu Glu 905 910 915 Ser Asp Thr Arg Tyr Phe Phe Lys Met Gly Ala Arg Thr Glu Val 920 925 930 Gly Pro Gly Pro Phe Ser Arg Leu Gln Asp Val Ile Thr Leu Gln 935 940 945 Glu Lys Leu Ser Asp Ser Leu Asp Met His Ser Val Thr Gly Ile 950 955 960 Ile Val Gly Val Cys Leu Gly Leu Leu Cys Leu Leu Ala Cys Met 965 970 975 Cys Ala Gly Leu Arg Arg Ser Pro His Arg Glu Ser Leu Pro Gly 980 985 990 Leu Ser Ser Thr Ala Thr Pro Gly Asn Pro Ala Leu Tyr Ser Arg 995 1000 1005 Ala Arg Leu Gly Pro Pro Ser Pro Pro Ala Ala His Glu Leu Glu 1010 1015 1020 Ser Leu Val His Pro His Pro Gln Asp Trp Ser Pro Pro Pro Ser 1025 1030 1035 Asp Val Glu Asp Arg Ala Glu Val His Ser Leu Met Gly Gly Gly 1040 1045 1050 Val Ser Glu Gly Arg Ser His Ser Lys Arg Lys Ile Ser Trp Ala 1055 1060 1065 Gln Pro Ser Gly Leu Ser Trp Ala Gly Ser Trp Ala Gly Cys Glu 1070 1075 1080 Leu Pro Gln Ala Gly Pro Arg Pro Ala Leu Thr Arg Ala Leu Leu 1085 1090 1095 Pro Pro Ala Gly Thr Gly Gln Thr Leu Leu Leu Gln Ala Leu Val 1100 1105 1110 Tyr Asp Ala Ile Lys Gly Asn Gly Arg Lys Lys Ser Pro Pro Ala 1115 1120 1125 Cys Arg Asn Gln Val Glu Ala Glu Val Ile Val His Ser Asp Phe 1130 1135 1140 Ser Ala Ser Asn Gly Asn Pro Asp Leu His Leu Gln Asp Leu Glu 1145 1150 1155 Pro Glu Asp Pro Leu Pro Pro Glu Ala Pro Asp Leu Ile Ser Gly 1160 1165 1170 Val Gly Asp Pro Gly Gln Gly Ala Ala Trp Leu Asp Arg Glu Leu 1175 1180 1185 Gly Gly Cys Glu Leu Ala Ala Pro Gly Pro Asp Arg Leu Thr Cys 1190 1195 1200 Leu Pro Glu Ala Ala Ser Ala Ser Cys Ser Tyr Pro Asp Leu Gln 1205 1210 1215 Pro Gly Glu Val Leu Glu Glu Thr Pro Gly Asp Ser Cys Gln Leu 1220 1225 1230 Lys Ser Pro Cys Pro Leu Gly Ala Ser Pro Gly Leu Pro Arg Ser 1235 1240 1245 Pro Val Ser Ser Ser Ala 1250 15 238 PRT Homo sapiens misc_feature Incyte ID No 7493736CD1 15 Met Ser Ser Gly Tyr Ser Ser 
Leu Glu Glu Asp Ala Glu Asp Phe 1 5 10 15 Phe Phe Thr Ala Arg Thr Ser Phe Phe Arg Arg Ala Pro Gln Gly 20 25 30 Lys Pro Arg Ser Gly Gln Gln Asp Val Glu Lys Glu Lys Glu Thr 35 40 45 His Ser Tyr Leu Ser Lys Glu Glu Ile Lys Glu Lys Val His Lys 50 55 60 Tyr Asn Leu Ala Val Thr Asp Lys Leu Lys Met Thr Leu Asn Ser 65 70 75 Asn Gly Ile Tyr Thr Gly Phe Ile Lys Val Gln Met Glu Leu Cys 80 85 90 Lys Pro Pro Gln Thr Ser Pro Asn Ser Gly Lys Leu Ser Pro Ser 95 100 105 Ser Asn Gly Cys Met Asn Thr Leu His Ile Ser Ser Thr Asn Thr 110 115 120 Val Gly Glu Val Ile Glu Ala Leu Leu Lys Lys Phe Leu Val Thr 125 130 135 Glu Ser Pro Ala Lys Phe Ala Leu Tyr Lys Arg Cys His Arg Glu 140 145 150 Asp Gln Val Tyr Ala Cys Lys Leu Ser Asp Arg Glu His Pro Leu 155 160 165 Tyr Leu Arg Leu Val Ala Gly Pro Arg Thr Asp Thr Leu Ser Phe 170 175 180 Val Leu Arg Glu His Glu Ile Gly Glu Trp Glu Ala Phe Ser Leu 185 190 195 Pro Glu Leu Gln Asn Phe Leu Arg Ile Leu Asp Lys Glu Glu Asp 200 205 210 Glu Gln Leu Gln Asn Leu Lys Arg Arg Tyr Thr Ala Tyr Arg Gln 215 220 225 Lys Leu Glu Glu Ala Leu Arg Glu Val Trp Lys Pro Asp 230 235 16 1082 PRT Homo sapiens misc_feature Incyte ID No 4614878CD1 16 Met Asp Ser Gln Gln Thr Asp Phe Arg Ala His Asn Val Pro Leu 1 5 10 15 Lys Leu Pro Met Pro Glu Pro Gly Glu Leu Glu Glu Arg Phe Ala 20 25 30 Ile Val Leu Asn Ala Met Asn Leu Pro Pro Asp Lys Ala Arg Leu 35 40 45 Leu Arg Gln Tyr Asp Asn Glu Lys Lys Trp Glu Leu Ile Cys Asp 50 55 60 Gln Glu Arg Phe Gln Val Lys Asn Pro Pro His Thr Tyr Ile Gln 65 70 75 Lys Leu Lys Gly Tyr Leu Asp Pro Ala Val Thr Arg Lys Lys Phe 80 85 90 Arg Arg Arg Val Gln Glu Ser Thr Gln Val Leu Arg Glu Leu Glu 95 100 105 Ile Ser Leu Arg Thr Asn His Ile Gly Trp Val Arg Glu Phe Leu 110 115 120 Asn Glu Glu Asn Lys Gly Leu Asp Val Leu Val Glu Tyr Leu Ser 125 130 135 Phe Ala Gln Tyr Ala Val Thr Phe Asp Phe Glu Ser Val Glu Ser 140 145 150 Thr Val Glu Ser Ser Val Asp Lys Ser Lys Pro Trp Ser Arg Ser 155 160 165 Ile Glu Asp Leu His Arg Gly Ser Asn Leu Pro Ser Pro Val Gly 170 
175 180 Asn Ser Val Ser Arg Ser Gly Arg His Ser Ala Leu Arg Tyr Asn 185 190 195 Thr Leu Pro Ser Arg Arg Thr Leu Lys Asn Ser Arg Leu Val Ser 200 205 210 Lys Lys Asp Asp Val His Val Cys Ile Met Cys Leu Arg Ala Ile 215 220 225 Met Asn Tyr Gln Tyr Gly Phe Asn Met Val Met Ser His Pro His 230 235 240 Ala Val Asn Glu Ile Ala Leu Ser Leu Asn Asn Lys Asn Pro Arg 245 250 255 Thr Lys Ala Leu Val Leu Glu Leu Leu Ala Ala Val Cys Leu Val 260 265 270 Arg Gly Gly His Glu Ile Ile Leu Ser Ala Phe Asp Asn Phe Lys 275 280 285 Glu Val Cys Gly Glu Lys Gln Arg Phe Glu Lys Leu Met Glu His 290 295 300 Phe Arg Asn Glu Asp Asn Asn Ile Asp Phe Met Val Ala Ser Met 305 310 315 Gln Phe Ile Asn Ile Val Val His Ser Val Glu Asp Met Asn Phe 320 325 330 Arg Val His Leu Gln Tyr Glu Phe Thr Lys Leu Gly Leu Asp Glu 335 340 345 Tyr Leu Asp Lys Leu Lys His Thr Glu Ser Asp Lys Leu Gln Val 350 355 360 Gln Ile Gln Ala Tyr Leu Asp Asn Val Phe Asp Val Gly Ala Leu 365 370 375 Leu Glu Asp Ala Glu Thr Lys Asn Ala Ala Leu Glu Arg Val Glu 380 385 390 Glu Leu Glu Glu Asn Ile Ser His Leu Ser Glu Lys Leu Gln Asp 395 400 405 Thr Glu Asn Glu Ala Met Ser Lys Ile Val Glu Leu Glu Lys Gln 410 415 420 Leu Met Gln Arg Asn Lys Glu Leu Asp Val Val Arg Glu Ile Tyr 425 430 435 Lys Asp Ala Asn Thr Gln Val His Thr Leu Arg Lys Met Val Lys 440 445 450 Glu Lys Glu Glu Ala Ile Gln Arg Gln Ser Thr Leu Glu Lys Lys 455 460 465 Ile His Glu Leu Glu Lys Gln Gly Thr Ile Lys Ile Gln Lys Lys 470 475 480 Gly Asp Gly Asp Ile Ala Ile Leu Pro Val Val Ala Ser Gly Thr 485 490 495 Leu Ser Met Gly Ser Glu Val Val Ala Gly Asn Ser Val Gly Pro 500 505 510 Thr Met Gly Ala Ala Ser Ser Gly Pro Leu Pro Pro Pro Pro Pro 515 520 525 Pro Leu Pro Pro Ser Ser Asp Thr Pro Glu Thr Val Gln Asn Gly 530 535 540 Pro Val Thr Pro Pro Met Pro Pro Pro Pro Pro Pro Pro Pro Pro 545 550 555 Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Pro Leu Pro 560 565 570 Gly Pro Ala Ala Glu Thr Val Pro Ala Pro Pro Leu Ala Pro Pro 575 580 585 Leu Pro Ser Ala Pro Pro Leu Pro Gly Thr Ser Ser 
Pro Thr Val 590 595 600 Val Phe Asn Ser Gly Leu Ala Ala Val Lys Ile Lys Lys Pro Ile 605 610 615 Lys Thr Lys Phe Arg Met Pro Val Phe Asn Trp Val Ala Leu Lys 620 625 630 Pro Asn Gln Ile Asn Gly Thr Val Phe Asn Glu Ile Asp Asp Glu 635 640 645 Arg Ile Leu Glu Asp Leu Asn Val Asp Glu Phe Glu Glu Ile Phe 650 655 660 Lys Thr Lys Ala Gln Gly Pro Ala Ile Asp Leu Ser Ser Ser Lys 665 670 675 Gln Lys Ile Pro Gln Lys Gly Ser Asn Lys Val Thr Leu Leu Glu 680 685 690 Ala Asn Arg Ala Lys Asn Leu Ala Ile Thr Leu Arg Lys Ala Gly 695 700 705 Lys Thr Ala Asp Glu Ile Cys Lys Ala Ile His Val Phe Asp Leu 710 715 720 Lys Thr Leu Pro Val Asp Phe Val Glu Cys Leu Met Arg Phe Leu 725 730 735 Pro Thr Glu Asn Glu Val Lys Val Leu Arg Leu Tyr Glu Arg Glu 740 745 750 Arg Lys Pro Leu Glu Asn Leu Ser Asp Glu Asp Arg Phe Met Met 755 760 765 Gln Phe Ser Lys Ile Glu Arg Leu Met Gln Lys Met Thr Ile Met 770 775 780 Ala Phe Ile Gly Asn Phe Ala Glu Ser Ile Gln Met Leu Thr Pro 785 790 795 Gln Leu His Ala Ile Ile Ala Ala Ser Val Ser Ile Lys Ser Ser 800 805 810 Gln Lys Leu Lys Lys Ile Leu Glu Ile Ile Leu Ala Leu Gly Asn 815 820 825 Tyr Met Asn Ser Ser Lys Arg Gly Ala Val Tyr Gly Phe Lys Leu 830 835 840 Gln Ser Leu Asp Leu Leu Leu Asp Thr Lys Ser Thr Asp Arg Lys 845 850 855 Gln Thr Leu Leu His Tyr Ile Ser Asn Val Val Lys Glu Lys Tyr 860 865 870 His Gln Val Ser Leu Phe Tyr Asn Glu Leu His Tyr Val Glu Lys 875 880 885 Ala Ala Ala Val Ser Leu Glu Asn Val Leu Leu Asp Val Lys Glu 890 895 900 Leu Gln Arg Gly Met Asp Leu Thr Lys Arg Glu Tyr Thr Met His 905 910 915 Asp His Asn Thr Leu Leu Lys Glu Phe Ile Leu Asn Asn Glu Gly 920 925 930 Lys Leu Lys Lys Leu Gln Asp Asp Ala Lys Ile Ala Gln Asp Ala 935 940 945 Phe Asp Asp Val Val Lys Tyr Phe Gly Glu Asn Pro Lys Thr Thr 950 955 960 Pro Pro Ser Val Phe Phe Pro Val Phe Val Arg Phe Val Lys Ala 965 970 975 Tyr Lys Gln Ala Glu Glu Glu Asn Glu Leu Arg Lys Lys Gln Glu 980 985 990 Gln Ala Leu Met Glu Lys Leu Leu Glu Gln Glu Ala Leu Met Glu 995 1000 1005 Gln Gln Asp Pro Lys Ser Pro 
Ser His Lys Ser Lys Arg Gln Gln 1010 1015 1020 Gln Glu Leu Ile Ala Glu Leu Arg Arg Arg Gln Val Lys Asp Asn 1025 1030 1035 Arg His Val Tyr Glu Gly Lys Asp Gly Ala Ile Glu Asp Ile Ile 1040 1045 1050 Thr Ala Leu Lys Lys Asn Asn Ile Thr Lys Phe Pro Asn Val His 1055 1060 1065 Ser Arg Val Arg Ile Ser Ser Ser Thr Pro Val Val Glu Asp Thr 1070 1075 1080 Gln Ser 17 428 PRT Homo sapiens misc_feature Incyte ID No 7498437CD1 17 Met Phe Ser Leu Met Ala Ile Cys Cys Gly Trp Phe Lys Arg Arg 1 5 10 15 Arg Glu Pro Val Arg Lys Val Thr Leu Leu Met Val Gly Leu Asp 20 25 30 Asn Ala Gly Lys Thr Ala Thr Ala Lys Gly Ile Gln Gly Glu Tyr 35 40 45 Pro Glu Asp Val Ala Pro Thr Val Gly Phe Ser Lys Ile Asn Leu 50 55 60 Arg Gln Gly Lys Phe Glu Val Thr Ile Phe Asp Leu Gly Gly Gly 65 70 75 Ile Arg Ile Arg Gly Ile Trp Lys Asn Tyr Tyr Ala Glu Ser Tyr 80 85 90 Gly Val Ile Phe Val Val Asp Ser Ser Asp Glu Glu Arg Met Glu 95 100 105 Glu Thr Lys Glu Ala Met Ser Glu Met Leu Arg His Pro Arg Ile 110 115 120 Ser Gly Lys Pro Ile Leu Val Leu Ala Asn Lys Gln Asp Lys Glu 125 130 135 Gly Ala Leu Gly Glu Ala Asp Val Ile Glu Cys Leu Ser Leu Glu 140 145 150 Lys Leu Val Asn Glu His Lys Cys Leu Cys Gln Ile Glu Pro Cys 155 160 165 Ser Ala Ile Ser Gly Tyr Gly Lys Lys Ile Asp Lys Ser Ile Lys 170 175 180 Lys Gly Leu Tyr Trp Leu Leu His Val Ile Ala Arg Asp Phe Asp 185 190 195 Ala Leu Asn Glu Arg Ile Gln Lys Glu Thr Thr Glu Gln Arg Ala 200 205 210 Leu Glu Glu Gln Glu Lys Gln Glu Arg Ala Glu Arg Val Arg Lys 215 220 225 Leu Arg Glu Glu Arg Lys Gln Asn Glu Gln Glu Gln Ala Glu Leu 230 235 240 Asp Gly Thr Ser Gly Leu Ala Glu Leu Asp Pro Glu Pro Thr Asn 245 250 255 Pro Phe Gln Pro Ile Ala Ser Val Ile Ile Glu Asn Glu Gly Lys 260 265 270 Leu Glu Arg Glu Lys Lys Asn Gln Lys Met Glu Lys Asp Ser Asp 275 280 285 Gly Cys His Leu Lys His Lys Met Glu His Glu Gln Ile Glu Thr 290 295 300 Gln Gly Gln Val Asn His Asn Gly Gln Lys Asn Asn Glu Phe Gly 305 310 315 Leu Val Glu Asn Tyr Lys Glu Ala Leu Thr Gln Gln Leu Lys Asn 320 325 330 Glu Asp Glu 
Thr Asp Arg Pro Ser Leu Glu Ser Ala Asn Gly Lys 335 340 345 Lys Lys Thr Lys Lys Leu Arg Met Lys Arg Asn His Arg Val Glu 350 355 360 Pro Leu Asn Ile Asp Asp Cys Ala Pro Glu Ser Pro Thr Pro Pro 365 370 375 Pro Pro Pro Pro Pro Val Gly Trp Gly Thr Pro Lys Val Thr Arg 380 385 390 Leu Pro Lys Leu Glu Pro Leu Gly Glu Thr His His Asn Asp Phe 395 400 405 Tyr Arg Lys Pro Leu Pro Pro Leu Ala Val Pro Gln Arg Pro Asn 410 415 420 Ser Asp Ala His Asp Val Ile Ser 425 18 633 PRT Homo sapiens misc_feature Incyte ID No 3097848CD1 18 Met Glu Ser Gly Pro Lys Met Leu Ala Pro Val Cys Leu Val Glu 1 5 10 15 Asn Asn Asn Glu Gln Leu Leu Val Asn Gln Gln Ala Ile Gln Ile 20 25 30 Leu Glu Lys Ile Ser Gln Pro Val Val Val Val Ala Ile Val Gly 35 40 45 Leu Tyr Arg Thr Gly Lys Ser Tyr Leu Met Asn His Leu Ala Gly 50 55 60 Gln Asn His Gly Phe Pro Leu Gly Ser Thr Val Gln Ser Glu Thr 65 70 75 Lys Gly Ile Trp Met Trp Cys Val Pro His Pro Ser Lys Pro Asn 80 85 90 His Thr Leu Val Leu Leu Asp Thr Glu Gly Leu Gly Asp Val Glu 95 100 105 Lys Gly Asp Pro Lys Asn Asp Ser Trp Ile Phe Ala Leu Ala Val 110 115 120 Leu Leu Cys Ser Thr Phe Val Tyr Asn Ser Met Ser Thr Ile Asn 125 130 135 His Gln Ala Leu Glu Gln Leu His Tyr Val Thr Glu Leu Thr Glu 140 145 150 Leu Ile Lys Ala Lys Ser Ser Pro Arg Pro Asp Gly Ala Glu Asp 155 160 165 Ser Thr Glu Phe Val Ser Phe Phe Pro Asp Phe Leu Trp Thr Val 170 175 180 Arg Asp Phe Thr Leu Glu Leu Lys Leu Asn Gly His Pro Ile Thr 185 190 195 Glu Asp Glu Tyr Leu Glu Asn Ala Leu Lys Leu Ile Gln Gly Asn 200 205 210 Asn Pro Arg Val Gln Thr Ser Asn Phe Pro Arg Glu Cys Ile Arg 215 220 225 Arg Phe Phe Pro Lys Arg Lys Cys Phe Val Phe Asp Arg Pro Thr 230 235 240 Asn Asp Lys Asp Leu Leu Ala Asn Ile Glu Lys Val Ser Glu Lys 245 250 255 Gln Leu Asp Pro Lys Phe Gln Glu Gln Thr Asn Ile Phe Cys Ser 260 265 270 Tyr Ile Phe Thr His Ala Arg Thr Lys Thr Leu Arg Glu Gly Ile 275 280 285 Thr Val Thr Gly Asn Arg Leu Gly Thr Leu Ala Val Thr Tyr Val 290 295 300 Glu Ala Ile Asn Ser Gly Ala Val Pro Cys Leu Glu Asn Ala 
Val 305 310 315 Ile Thr Leu Ala Gln Arg Glu Asn Ser Ala Ala Val Gln Arg Ala 320 325 330 Ser Asp Tyr Tyr Ser Gln Gln Met Ala Gln Arg Val Lys Leu Pro 335 340 345 Thr Asp Thr Leu Gln Glu Leu Leu Asp Met His Ala Ala Cys Glu 350 355 360 Arg Glu Ala Ile Ala Ile Phe Met Glu His Ser Phe Lys Asp Glu 365 370 375 Asn Gln Glu Phe Gln Lys Lys Phe Met Glu Thr Thr Met Asn Lys 380 385 390 Lys Gly Asp Phe Leu Leu Gln Asn Glu Glu Ser Ser Val Gln Tyr 395 400 405 Cys Gln Ala Lys Leu Asn Glu Leu Ser Lys Gly Leu Met Glu Ser 410 415 420 Ile Ser Ala Gly Ser Phe Ser Val Pro Gly Gly His Lys Leu Tyr 425 430 435 Met Glu Thr Lys Glu Arg Ile Glu Gln Asp Tyr Trp Gln Val Pro 440 445 450 Arg Lys Gly Val Lys Ala Lys Glu Val Phe Gln Arg Phe Leu Glu 455 460 465 Ser Gln Met Val Ile Glu Glu Ser Ile Leu Gln Ser Asp Lys Ala 470 475 480 Leu Thr Asp Arg Glu Lys Ala Val Ala Val Asp Arg Ala Lys Lys 485 490 495 Glu Ala Ala Glu Lys Glu Gln Glu Leu Leu Lys Gln Lys Leu Gln 500 505 510 Glu Gln Gln Gln Gln Met Glu Ala Gln Val Lys Ser Arg Lys Glu 515 520 525 Asn Ile Ala Gln Leu Lys Glu Lys Leu Gln Met Glu Arg Glu His 530 535 540 Leu Leu Arg Glu Gln Ile Met Met Leu Glu His Thr Gln Lys Val 545 550 555 Gln Asn Asp Trp Leu His Glu Gly Phe Lys Lys Lys Tyr Glu Glu 560 565 570 Met Asn Ala Glu Ile Ser Gln Phe Lys Arg Met Ile Asp Thr Thr 575 580 585 Lys Asn Asp Asp Thr Pro Trp Ile Ala Arg Thr Leu Asp Asn Leu 590 595 600 Ala Asp Glu Leu Thr Ala Ile Leu Ser Ala Pro Ala Lys Leu Ile 605 610 615 Gly His Gly Val Lys Gly Val Ser Ser Leu Phe Lys Lys His Lys 620 625 630 Leu Pro Phe 19 1958 PRT Homo sapiens misc_feature Incyte ID No 2957789CD1 19 Met Met Ala Thr Arg Arg Thr Gly Leu Ser Glu Gly Asp Gly Asp 1 5 10 15 Lys Leu Lys Ala Cys Glu Val Ser Lys Asn Lys Asp Gly Lys Glu 20 25 30 Gln Ser Glu Thr Val Ser Leu Ser Glu Asp Glu Thr Phe Ser Trp 35 40 45 Pro Gly Pro Lys Thr Val Thr Leu Lys Arg Thr Ser Gln Gly Phe 50 55 60 Gly Phe Thr Leu Arg His Phe Ile Val Tyr Pro Pro Glu Ser Ala 65 70 75 Ile Gln Phe Ser Tyr Lys Asp Glu Glu Asn Gly Asn 
Arg Gly Gly 80 85 90 Lys Gln Arg Asn Arg Leu Glu Pro Met Asp Thr Ile Phe Val Lys 95 100 105 Gln Val Lys Glu Gly Gly Pro Ala Phe Glu Ala Gly Leu Cys Thr 110 115 120 Gly Asp Arg Ile Ile Lys Val Asn Gly Glu Ser Val Ile Gly Lys 125 130 135 Thr Tyr Ser Gln Val Ile Ala Leu Ile Gln Asn Ser Asp Thr Thr 140 145 150 Leu Glu Leu Ser Val Met Pro Lys Asp Glu Asp Ile Leu Gln Val 155 160 165 Leu Gln Phe Thr Lys Asp Val Thr Ala Leu Ala Tyr Ser Gln Asp 170 175 180 Ala Tyr Leu Lys Gly Asn Glu Ala Tyr Ser Gly Asn Ala Arg Asn 185 190 195 Ile Pro Glu Pro Pro Pro Ile Cys Tyr Pro Trp Leu Pro Ser Ala 200 205 210 Pro Ser Ala Met Ala Gln Pro Val Glu Ile Ser Pro Pro Asp Ser 215 220 225 Ser Leu Ser Lys Gln Gln Thr Ser Thr Pro Val Leu Thr Gln Pro 230 235 240 Gly Arg Ala Tyr Arg Met Glu Ile Gln Val Pro Pro Ser Pro Thr 245 250 255 Asp Val Ala Lys Ser Asn Thr Ala Val Cys Val Cys Asn Glu Ser 260 265 270 Val Arg Thr Val Ile Val Pro Ser Glu Lys Val Val Asp Leu Leu 275 280 285 Ser Asn Arg Asn Asn His Thr Gly Pro Ser His Arg Thr Glu Glu 290 295 300 Val Arg Tyr Gly Val Ser Glu Gln Thr Ser Leu Lys Thr Val Ser 305 310 315 Arg Thr Thr Ser Pro Pro Leu Ser Ile Pro Thr Thr His Leu Ile 320 325 330 His Gln Pro Ala Gly Ser Arg Ser Leu Glu Pro Ser Gly Ile Leu 335 340 345 Leu Lys Ser Gly Asn Tyr Ser Gly His Ser Asp Gly Ile Ser Ser 350 355 360 Ser Arg Ser Gln Ala Val Glu Ala Pro Ser Val Ser Val Asn His 365 370 375 Tyr Ser Pro Asn Ser His Gln His Ile Asp Trp Lys Asn Tyr Lys 380 385 390 Thr Tyr Lys Glu Tyr Ile Asp Asn Arg Arg Leu His Ile Gly Cys 395 400 405 Arg Thr Ile Gln Glu Arg Leu Asp Ser Leu Arg Ala Ala Ser Gln 410 415 420 Ser Thr Thr Asp Tyr Asn Gln Val Val Pro Asn Arg Thr Thr Leu 425 430 435 Gln Gly Arg Arg Arg Ser Thr Ser His Asp Arg Val Pro Gln Ser 440 445 450 Val Gln Ile Arg Gln Arg Ser Val Ser Gln Glu Arg Leu Glu Asp 455 460 465 Ser Val Leu Met Lys Tyr Cys Pro Arg Ser Ala Ser Gln Gly Ala 470 475 480 Leu Thr Ser Pro Ser Val Ser Phe Ser Asn His Arg Thr Arg Ser 485 490 495 Trp Asp Tyr Ile Glu Gly Gln Asp Glu 
Thr Leu Glu Asn Val Asn 500 505 510 Ser Gly Thr Pro Ile Pro Asp Ser Asn Gly Glu Lys Lys Gln Thr 515 520 525 Tyr Lys Trp Ser Gly Phe Thr Glu Gln Asp Asp Arg Arg Gly Ile 530 535 540 Cys Glu Arg Pro Arg Gln Gln Glu Ile His Lys Ser Phe Arg Gly 545 550 555 Ser Asn Phe Thr Val Ala Pro Ser Val Val Asn Ser Asp Asn Arg 560 565 570 Arg Met Ser Gly Arg Gly Val Gly Ser Val Ser Gln Phe Lys Lys 575 580 585 Ile Pro Pro Asp Leu Lys Thr Leu Gln Ser Asn Arg Asn Phe Gln 590 595 600 Thr Thr Cys Gly Met Ser Leu Pro Arg Gly Ile Ser Gln Asp Arg 605 610 615 Ser Pro Leu Val Lys Val Arg Ser Asn Ser Leu Lys Ala Pro Ser 620 625 630 Thr His Val Thr Lys Pro Ser Phe Ser Gln Lys Ser Phe Val Ser 635 640 645 Ile Lys Asp Gln Arg Pro Val Asn His Leu His Gln Asn Ser Leu 650 655 660 Leu Asn Gln Gln Thr Trp Val Arg Thr Asp Ser Ala Pro Asp Gln 665 670 675 Gln Val Glu Thr Gly Lys Ser Pro Ser Leu Ser Gly Ala Ser Ala 680 685 690 Lys Pro Ala Pro Gln Ser Ser Glu Asn Ala Gly Thr Ser Asp Leu 695 700 705 Glu Leu Pro Val Ser Gln Arg Asn Gln Asp Leu Ser Leu Gln Glu 710 715 720 Ala Glu Thr Glu Gln Ser Asp Thr Leu Asp Asn Lys Glu Ala Val 725 730 735 Ile Leu Arg Glu Lys Pro Pro Ser Gly Arg Gln Thr Pro Gln Pro 740 745 750 Leu Arg His Gln Ser Tyr Ile Leu Ala Val Asn Asp Gln Glu Thr 755 760 765 Gly Ser Asp Thr Thr Cys Trp Leu Pro Asn Asp Ala Arg Arg Glu 770 775 780 Val His Ile Lys Arg Met Glu Glu Arg Lys Ala Ser Ser Thr Ser 785 790 795 Pro Pro Gly Asp Ser Leu Ala Ser Ile Pro Phe Ile Asp Glu Pro 800 805 810 Thr Ser Pro Ser Ile Asp His Asp Ile Ala His Ile Pro Ala Ser 815 820 825 Ala Val Ile Ser Ala Ser Thr Ser Gln Val Pro Ser Ile Ala Thr 830 835 840 Val Pro Pro Cys Leu Thr Thr Ser Ala Pro Leu Ile Arg Arg Gln 845 850 855 Leu Ser His Asp His Glu Ser Val Gly Pro Pro Ser Leu Asp Ala 860 865 870 Gln Pro Asn Ser Lys Thr Glu Arg Ser Lys Ser Tyr Asp Glu Gly 875 880 885 Leu Asp Asp Tyr Arg Glu Asp Ala Lys Leu Ser Phe Lys His Val 890 895 900 Ser Ser Leu Lys Gly Ile Lys Ile Ala Asp Ser Gln Lys Ser Ser 905 910 915 Glu Asp Ser Gly Ser 
Arg Lys Asp Ser Ser Ser Glu Val Phe Ser 920 925 930 Asp Ala Ala Lys Glu Gly Trp Leu His Phe Arg Pro Leu Val Thr 935 940 945 Asp Lys Gly Lys Arg Val Gly Gly Ser Ile Arg Pro Trp Lys Gln 950 955 960 Met Tyr Val Val Leu Arg Gly His Ser Leu Tyr Leu Tyr Lys Asp 965 970 975 Lys Arg Glu Gln Thr Thr Pro Ser Glu Glu Glu Gln Pro Ile Ser 980 985 990 Val Asn Ala Cys Leu Ile Asp Ile Ser Tyr Ser Glu Thr Lys Arg 995 1000 1005 Lys Asn Val Phe Arg Leu Thr Thr Ser Asp Cys Glu Cys Leu Phe 1010 1015 1020 Gln Ala Glu Asp Arg Asp Asp Met Leu Ala Trp Ile Lys Thr Ile 1025 1030 1035 Gln Glu Ser Ser Asn Leu Asn Glu Glu Asp Thr Gly Val Thr Asn 1040 1045 1050 Arg Asp Leu Ile Ser Arg Arg Ile Lys Glu Tyr Asn Asn Leu Met 1055 1060 1065 Ser Lys Ala Glu Gln Leu Pro Lys Thr Pro Arg Gln Ser Leu Ser 1070 1075 1080 Ile Arg Gln Thr Leu Leu Gly Ala Lys Ser Glu Pro Lys Thr Gln 1085 1090 1095 Ser Pro His Ser Pro Lys Glu Glu Ser Glu Arg Lys Leu Leu Ser 1100 1105 1110 Lys Asp Asp Thr Ser Pro Pro Lys Asp Lys Gly Thr Trp Arg Lys 1115 1120 1125 Gly Ile Pro Ser Ile Met Arg Lys Thr Phe Glu Lys Lys Pro Thr 1130 1135 1140 Ala Thr Gly Thr Phe Gly Val Arg Leu Asp Asp Cys Pro Pro Ala 1145 1150 1155 His Thr Asn Arg Tyr Ile Pro Leu Ile Val Asp Ile Cys Cys Lys 1160 1165 1170 Leu Val Glu Glu Arg Gly Leu Glu Tyr Thr Gly Ile Tyr Arg Val 1175 1180 1185 Pro Gly Asn Asn Ala Ala Ile Ser Ser Met Gln Glu Glu Leu Asn 1190 1195 1200 Lys Gly Met Ala Asp Ile Asp Ile Gln Asp Asp Lys Trp Arg Asp 1205 1210 1215 Leu Asn Val Ile Ser Ser Leu Leu Lys Ser Phe Phe Arg Lys Leu 1220 1225 1230 Pro Glu Pro Leu Phe Thr Asn Asp Lys Tyr Ala Asp Phe Ile Glu 1235 1240 1245 Ala Asn Arg Lys Glu Asp Pro Leu Asp Arg Leu Lys Thr Leu Lys 1250 1255 1260 Arg Leu Ile His Asp Leu Pro Glu His His Tyr Glu Thr Leu Lys 1265 1270 1275 Phe Leu Ser Ala His Leu Lys Thr Val Ala Glu Asn Ser Glu Lys 1280 1285 1290 Asn Lys Met Glu Pro Arg Asn Leu Ala Ile Val Phe Gly Pro Thr 1295 1300 1305 Leu Val Arg Thr Ser Glu Asp Asn Met Thr His Met Val Thr His 1310 1315 1320 Met Pro 
Asp Gln Tyr Lys Ile Val Glu Thr Leu Ile Gln His His 1325 1330 1335 Asp Trp Phe Phe Thr Glu Glu Gly Ala Glu Glu Pro Leu Thr Thr 1340 1345 1350 Val Gln Glu Glu Ser Thr Val Asp Ser Gln Pro Val Pro Asn Ile 1355 1360 1365 Asp His Leu Leu Thr Asn Ile Gly Arg Thr Gly Val Ser Pro Gly 1370 1375 1380 Asp Val Ser Asp Ser Ala Thr Ser Asp Ser Thr Lys Ser Lys Gly 1385 1390 1395 Ser Trp Gly Ser Gly Lys Asp Gln Tyr Ser Arg Glu Leu Leu Val 1400 1405 1410 Ser Ser Ile Phe Ala Ala Ala Ser Arg Lys Arg Lys Lys Pro Lys 1415 1420 1425 Glu Lys Ala Gln Pro Ser Ser Ser Glu Asp Glu Leu Asp Asn Val 1430 1435 1440 Phe Phe Lys Lys Glu Asn Val Glu Gln Cys His Asn Asp Thr Lys 1445 1450 1455 Glu Glu Ser Lys Lys Glu Ser Glu Thr Leu Gly Arg Lys Gln Lys 1460 1465 1470 Ile Ile Ile Ala Lys Glu Asn Ser Thr Arg Lys Asp Pro Ser Thr 1475 1480 1485 Thr Lys Asp Glu Lys Ile Ser Leu Gly Lys Glu Ser Thr Pro Ser 1490 1495 1500 Glu Glu Pro Ser Pro Pro His Asn Ser Lys His Asn Lys Ser Pro 1505 1510 1515 Thr Leu Ser Cys Arg Phe Ala Ile Leu Lys Glu Ser Pro Arg Ser 1520 1525 1530 Leu Leu Ala Gln Lys Ser Ser His Leu Glu Glu Thr Gly Ser Asp 1535 1540 1545 Ser Gly Thr Leu Leu Ser Thr Ser Ser Gln Ala Ser Leu Ala Arg 1550 1555 1560 Phe Ser Met Lys Lys Ser Thr Ser Pro Glu Thr Lys His Ser Glu 1565 1570 1575 Phe Leu Ala Asn Val Ser Thr Ile Thr Ser Asp Tyr Ser Thr Thr 1580 1585 1590 Ser Ser Ala Thr Tyr Leu Thr Ser Leu Asp Ser Ser Arg Leu Ser 1595 1600 1605 Pro Glu Val Gln Ser Val Ala Glu Ser Lys Gly Asp Glu Ala Asp 1610 1615 1620 Asp Glu Arg Ser Glu Leu Ile Ser Glu Gly Arg Pro Val Glu Thr 1625 1630 1635 Asp Ser Glu Ser Glu Phe Pro Val Phe Pro Thr Ala Leu Thr Ser 1640 1645 1650 Glu Arg Leu Phe Arg Gly Lys Leu Gln Glu Val Thr Lys Ser Ser 1655 1660 1665 Arg Arg Asn Ser Glu Gly Ser Glu Leu Ser Cys Thr Glu Gly Ser 1670 1675 1680 Leu Thr Ser Ser Leu Asp Ser Arg Arg Gln Leu Phe Ser Ser His 1685 1690 1695 Lys Leu Ile Glu Cys Asp Thr Leu Ser Arg Lys Lys Ser Ala Arg 1700 1705 1710 Phe Lys Ser Asp Ser Gly Ser Leu Gly Asp Ala Lys Asn Glu 
Lys 1715 1720 1725 Glu Ala Pro Ser Leu Thr Lys Val Phe Asp Val Met Lys Lys Gly 1730 1735 1740 Lys Ser Thr Gly Ser Leu Leu Thr Pro Thr Arg Gly Glu Ser Glu 1745 1750 1755 Lys Gln Glu Pro Thr Trp Lys Thr Lys Ile Ala Asp Arg Leu Lys 1760 1765 1770 Leu Arg Pro Arg Ala Pro Ala Asp Asp Met Phe Gly Val Gly Asn 1775 1780 1785 His Lys Val Asn Ala Glu Thr Ala Lys Arg Lys Ser Ile Arg Arg 1790 1795 1800 Arg His Thr Leu Gly Gly His Arg Asp Ala Thr Glu Ile Ser Val 1805 1810 1815 Leu Asn Phe Trp Lys Val His Glu Gln Ser Gly Glu Arg Glu Ser 1820 1825 1830 Glu Leu Ser Ala Val Asn Arg Leu Lys Pro Lys Cys Ser Ala Gln 1835 1840 1845 Asp Leu Ser Ile Ser Asp Trp Leu Ala Arg Glu Arg Leu Arg Thr 1850 1855 1860 Ser Thr Ser Asp Leu Ser Arg Gly Glu Ile Gly Asp Pro Gln Thr 1865 1870 1875 Glu Asn Pro Ser Thr Arg Glu Ile Ala Thr Thr Asp Thr Pro Leu 1880 1885 1890 Ser Leu His Cys Asn Thr Gly Ser Ser Ser Ser Thr Leu Ala Ser 1895 1900 1905 Thr Asn Arg Pro Leu Leu Ser Ile Pro Pro Gln Ser Pro Asp Gln 1910 1915 1920 Ile Asn Gly Glu Ser Phe Gln Asn Val Ser Lys Asn Ala Ser Ser 1925 1930 1935 Ala Ala Asn Ala Gln Pro His Lys Leu Ser Glu Thr Pro Gly Ser 1940 1945 1950 Lys Ala Glu Phe His Pro Cys Leu 1955 20 63 PRT Homo sapiens misc_feature Incyte ID No 5922849CD1 20 Met Ser Asn Asn Met Ala Lys Ile Ala Glu Ala Arg Lys Thr Val 1 5 10 15 Glu Gln Leu Lys Leu Glu Val Ser Gln Ala Ala Ala Glu Leu Leu 20 25 30 Ala Phe Cys Glu Thr His Ala Lys Asp Asp Pro Leu Val Thr Pro 35 40 45 Val Pro Ala Ala Glu Asn Pro Phe Arg Asp Lys Arg Leu Phe Cys 50 55 60 Gly Leu Leu 21 344 PRT Homo sapiens misc_feature Incyte ID No 7472828CD1 21 Met Gly Ile Gly Gly Arg Ala Pro Cys Phe Gln Thr Ala Tyr Leu 1 5 10 15 Ser Gly Pro Val Ala Ala Leu Gly Met Pro Val Met Ser Gly Ser 20 25 30 Phe Glu Leu Lys His Trp Arg Ala Gly Gly Leu Leu Tyr Trp Gly 35 40 45 Lys Val Gly Arg Glu Glu Val Arg Pro Ala Cys Leu Ser Arg Leu 50 55 60 Pro Arg Leu Glu Arg Arg Cys Ser Gln Pro Phe Leu Pro Ala Cys 65 70 75 Pro Arg Thr Trp Arg His Lys Ala Gln Glu Glu Asp Glu Glu 
Glu 80 85 90 Asn Lys Tyr Glu Leu Pro Pro Cys Glu Ala Leu Pro Leu Ser Leu 95 100 105 Ala Pro Ala His Leu Pro Gly Thr Glu Glu Asp Ser Leu Tyr Leu 110 115 120 Asp His Ser Gly Pro Leu Gly Pro Ser Lys Pro Ser Pro Pro Leu 125 130 135 Pro Gln Pro Thr Met Leu Lys Gly Ala Val Ser Leu Pro Val Ala 140 145 150 Gly Lys Gln Gly Pro Ile Phe Gly Arg Arg Glu Gln Gly Ala Ser 155 160 165 Ser Arg Val Val Pro Gly Pro Pro Lys Lys Pro Asp Glu Asp Leu 170 175 180 Tyr Leu Glu Cys Glu Pro Asp Pro Val Leu Ala Leu Thr Gln Thr 185 190 195 Leu Ser Phe Gln Val Leu Met Pro Ser Gly Pro Leu Pro Arg Thr 200 205 210 Ser Val Val Pro Arg Pro Thr Thr Ala Pro Gln Glu Thr Arg Asn 215 220 225 Gly Thr Ala Asp Ala Ala Ser Lys Glu Gly Arg Lys Ser Ser Leu 230 235 240 Pro Ser Val Ala Pro Thr Gly Ser Ala Ser Ala Ala Glu Asp Gly 245 250 255 Ala Tyr Thr Val Arg Pro Ser Ser Gly Pro His Gly Ser Gln Pro 260 265 270 Phe Thr Leu Ala Val Leu Leu Arg Gly Arg Val Phe Asn Ile Pro 275 280 285 Ile Arg Arg Leu Asp Gly Gly Arg His Tyr Ala Leu Gly Arg Glu 290 295 300 Gly Arg Asn Arg Glu Glu Leu Phe Ser Ser Val Ala Ala Met Val 305 310 315 Gln His Phe Met Trp His Pro Leu Pro Leu Val Asp Arg His Ser 320 325 330 Gly Ser Arg Glu Leu Thr Cys Leu Leu Phe Pro Thr Lys Pro 335 340 22 224 PRT Homo sapiens misc_feature Incyte ID No 8088595CD1 22 Met Pro Trp Arg Ala Pro Ser Ala Ser Ser Ala Ser Ala Gly Arg 1 5 10 15 Ile Leu Leu Arg Pro Thr Glu Glu Glu Gly Gly Ala Glu Arg Ser 20 25 30 Phe Ser Gly Pro Arg Gly Ser Ser Gly Arg Ile Pro Arg Phe Val 35 40 45 Ser Ile Ser Ile Thr Asn Gly Pro Val Phe Cys Gly Val Val Gly 50 55 60 Ala Val Ala Arg His Glu Tyr Thr Val Ile Gly Pro Lys Val Ser 65 70 75 Leu Ala Ala Arg Met Ile Thr Ala Tyr Pro Gly Leu Val Ser Cys 80 85 90 Asp Glu Val Thr Tyr Leu Arg Ser Met Leu Pro Ala Tyr Asn Phe 95 100 105 Lys Lys Leu Pro Glu Lys Met Met Lys Asn Ile Ser Asn Pro Gly 110 115 120 Lys Ile Tyr Glu Tyr Leu Gly His Arg Arg Cys Ile Met Phe Gly 125 130 135 Lys Arg His Leu Ala Arg Lys Arg Asn Lys Asn His Pro Leu Leu 140 145 150 Gly 
Val Leu Gly Ala Pro Cys Leu Ser Thr Asp Trp Glu Lys Glu 155 160 165 Leu Glu Ala Phe Gln Met Ala Gln Gln Gly Cys Leu His Gln Lys 170 175 180 Lys Gly Gln Ala Val Leu Tyr Glu Gly Gly Lys Gly Tyr Gly Lys 185 190 195 Ser Gln Leu Leu Ala Glu Ile Asn Phe Leu Ala Gln Lys Glu Gly 200 205 210 His Ser Tyr Pro Ser Gln Val Leu Trp Lys Pro Thr Leu Leu 215 220 23 309 PRT Homo sapiens misc_feature Incyte ID No 7488478CD1 23 Met Asn Leu Met Asp Ile Thr Lys Ile Phe Ser Leu Leu Gln Pro 1 5 10 15 Asp Lys Glu Glu Glu Asp Thr Asp Thr Glu Glu Lys Gln Ala Leu 20 25 30 Asn Gln Ala Val Tyr Asp Asn Asp Ser Tyr Thr Leu Asp Gln Leu 35 40 45 Leu Arg Gln Glu Arg Tyr Lys Arg Phe Ile Asn Ser Arg Ser Gly 50 55 60 Trp Gly Val Pro Gly Thr Pro Leu Arg Leu Ala Ala Ser Tyr Gly 65 70 75 His Leu Ser Cys Leu Gln Val Leu Leu Ala His Gly Ala Asp Val 80 85 90 Asp Ser Leu Asp Val Lys Ala Gln Thr Pro Leu Phe Thr Ala Val 95 100 105 Ser His Gly His Leu Asp Cys Val Arg Val Leu Leu Glu Ala Gly 110 115 120 Ala Ser Pro Gly Gly Ser Ile Tyr Asn Asn Cys Ser Pro Val Leu 125 130 135 Thr Ala Ala Arg Asp Gly Ala Val Ala Ile Leu Gln Glu Leu Leu 140 145 150 Asp His Gly Ala Glu Ala Asn Val Lys Ala Lys Leu Pro Val Trp 155 160 165 Ala Ser Asn Ile Ala Ser Cys Ser Gly Pro Leu Tyr Leu Ala Ala 170 175 180 Val Tyr Gly His Leu Asp Cys Phe Arg Leu Leu Leu Leu His Gly 185 190 195 Ala Asp Pro Asp Tyr Asn Cys Thr Asp Gln Gly Leu Leu Ala Arg 200 205 210 Val Pro Arg Pro Arg Thr Leu Leu Glu Ile Cys Leu His His Asn 215 220 225 Cys Glu Pro Glu Tyr Ile Gln Leu Leu Ile Asp Phe Gly Ala Asn 230 235 240 Ile Tyr Leu Pro Ser Leu Ser Leu Asp Leu Thr Ser Gln Asp Asp 245 250 255 Lys Gly Ile Ala Leu Leu Leu Gln Ala Arg Ala Thr Pro Arg Ser 260 265 270 Leu Leu Ser Gln Val Arg Leu Val Val Arg Arg Ala Leu Cys Gln 275 280 285 Ala Gly Gln Pro Gln Ala Ile Asn Gln Leu Asp Ile Pro Pro Met 290 295 300 Leu Ile Ser Tyr Leu Lys His Gln Leu 305 24 4184 DNA Homo sapiens misc_feature Incyte ID No 7461789CB1 24 atggcggacc ctctgaggag gacgctgtcc aggctccggg gaaggcgggg 
tccccgcggc 60 accggggggc tagggctccg ggcggcagcc gcagctgcgg tggcggcctc ttcggcagcc 120 gcgggagacg cctggggtgc cgcagacacc ctcccacgtg agcacgccgg gggacacggg 180 cggagcctac agcagccctc gccgtcacct gaggcgtggg gtcccggggc gcgggtcccc 240 ggagggcacc cggagcagtt gggggcgctc gggcctcggc cgcgcggcgg gcaggaggcc 300 gccccccaga gccatgggct cgctcacgcc cctccgcact ccccggaggg ctccgagggc 360 agcggagagg aggaggaaga cgacgaggac gaagacgact acgacgccga ctactacgaa 420 aacctgcccg gcggctcgca gtctgcgccc gagcctgagg gggcggaggc ggaacggcgt 480 cccccgcctc ccccagcggc gggctcctcc ctgggggcgg agggcggccg cctggagaca 540 ggcaggctgc ggacccagtt gcgagaggcc tattatctgc tgatccaggc catgcacgac 600 ctgccccctg actcgggcgc gcggcggggc ggcaggggct tggcggatca cagcttcccc 660 gcgggagccc gggctccggg ccagccgcct tcccgcggcg ccgcgtaccg ccgagcctgc 720 ccccgggacg gggagcgggg aggcggcgga cgccctcggc agcaggtgtc cccgccccgg 780 tcgcctcaga gggagccgcg gggaggccag ctgcggactc ctcggatgcg gccgtcctgc 840 agcagaagcc tcgagagcct ccgggtgggt gccaagccgc ctcccttcca gcggtggccg 900 agcgacagct ggatcaggct gcaaggacca cggctgctcc tcgggaagcc cttcagggat 960 ccagcggggt cctctgtgat acgcagtggc aaaggagacc gcccggaagg cccctccttc 1020 ctcaggccgc cggcagtgac agtcaagaag ctgcagaagt ggatgtacaa agggcgtctg 1080 ctgtccctgg gaatgaaggg tcgtgcccgt gggacggctc ccaaagtcac aggaacgcag 1140 gcagcctccc caaatgtggg cgctttgaaa gtgcgtgaaa accgtgtcct gtcggtgcct 1200 ccagaccaaa gaattacgct gacagattta tttgaaaatg cctatgggtc ttcaatgaag 1260 ggaagagaac ttgaagagct gaaggataat attgaattca gaggtcataa gccacttaac 1320 agcatcactg tttcaaagaa acgcaattgg ctatatcaga gtactctgag gcctcttaat 1380 ctggaagaag aaaataagaa atgccaagat agaagtcatt tatccatctc acctgtgtct 1440 ctacctaaac atcagctatc acagtctttc ctcaaatcat ctaaagagta ctgtacatat 1500 gtggtatgta acgctacaaa ctcttcatta tcgaaaaact gtgctttaga ttttaatgag 1560 gaaaatgatg cagatgatga aggagaaata tggtacaatc ccattcctga ggatgatgac 1620 cttggtatat caagtgcctt gagttttggt gaggccgact ctgctgttct gaagctccct 1680 gctgtcaatt tgagcatgtt gtctggcagt gacctgatga aagcagagcg gcatactgaa 1740 gactcactgt 
gctcttccga acatgcaggt gatattcaga ccacacggtc aaatggaatg 1800 aatcctatac atcctgccca ttccacagaa tttgtgcagc agtacaagca aaagctagga 1860 cacaagacac aagaaggtat aatggtggag gacagtccca tgttgaaatc tccttttgca 1920 ggttctggga tcctggctgc tacaaatagt actgaattgg gaattatgga accatcttct 1980 ccaaatccta gccctgtgaa aaaaggcagt tcaattaatt ggtcattgcc agataaaata 2040 aaatctccac gaactgtgag gaaactttcc atgaaaatga aaaagttgcc agaatttagc 2100 cgaaagctaa gcgttaaggg aacattgaat tatataaaca gtccagataa tactccttct 2160 ttgtctaaat ataactgccg agaaattcat catactgata ttctgccctc tgggaacaca 2220 accaccgctg ctaagaggaa tgttataagc cgataccatc ttgataccag tgtatcctcc 2280 cagcagagct accagaagaa aaactctatg agttctaagt attcctgcaa aggtggttac 2340 cttagtgatg gagactcacc tgaacttaca actaaagcta gcaaacatgg atctgaaaac 2400 aaatttggaa aaggaaaaga aataatttca aatagttgta gcaagaatga aatagacatt 2460 gatgctttta ggcattatag cttttctgat caacctaagt gttcacagta catatctggg 2520 ctcatgagtg tacatttcta tggtgctgag gatttaaaac cacctcggat agattcaaaa 2580 gacgtctttt gtgcaattca ggtagattca gtaaacaaag caagaacagc tttgctcaca 2640 tgccgaacaa catttttaga catggatcac actttcaaca tagaaattga aaatgcacaa 2700 catttgaaac tagtagtatt cagttgggaa cccactccaa gaaaaaatcg agtttgttgt 2760 catggaactg ttgttcttcc caccttattt agagtgacaa agactcatca gttggctgtc 2820 aaacttgaac ctagaggtct tatttatgtg aaagtgactc ttatggaaca gtgggagaat 2880 tctcttcatg gactagatat aaaccaagaa ccaataatat ttggagttga tattcaaaaa 2940 gttgtagaga aagaaaatat aggactgatg gtgccccttc tgatacagaa atgtattatg 3000 gaaattgaaa agagaggctg tcaggtagta ggcctgtatc gattatgtgg ttcggcagca 3060 gtcaagaaag aactgcgaga ggcttttgag agagatagca aagctgttgg tctgtgtgaa 3120 aaccagtacc cagatataaa tgtaataaca ggtgttctta aggattattt aagagaactc 3180 ccttctcctc tgataacaaa gcagctttat gaggctgtat tagatgcaat ggcaaaaagt 3240 cctttgaaaa tgtcatcaaa tggttgtgag aatgacccag gtgactctaa gtacactgtt 3300 gacctgctgg attgtctgcc agagattgag aaggcaaccc taaagatgtt gttggatcat 3360 ttgaaattgg tggcttccta tcatgaagtg aataagatga cgtgccagaa tttggctgtg 3420 tgctttggac cagtattatt 
aagtcagagg caagagcctt ccacccataa caacagagtc 3480 tttactgatt cagaagaact tgcaagtgct ttggatttta aaaaacacat tgaagttctt 3540 cattacttac tccaactctg gccagtgcag cgtttaactg tcaaaaaatc aacagacaat 3600 ttattcccag agcagaagtc ttctctgaat tatttgaggc agaagaaaga acgacctcat 3660 atgttaaatt tgagtggtac tgattcatca ggagtactta ggccaaggca aaaccgatta 3720 gacagtccac ttagcaatcg ttatgcagga gactggagca gctgtgggga aaactacttt 3780 ttaaatacaa aagaaaattt aaatgatgtg gattatgatg atgtcccttc agaagataga 3840 aaaatcggag aaaattatag caaaatggat gggccagaag taatgattga acagccaatt 3900 cccatgtcca aagagtgtac atttcagaca tatttgacaa tgcagacaat tgagtctaca 3960 gtggatcgaa aaaacaatct caaagatcta caagaaagta ttgatacttt gattggaaat 4020 ctggaacgtg agctcaacaa aaacaagctt aatatgagtt tttgagattg agggtttttt 4080 tttgtaatgt cttaaagttt caaatctgtt tttgttttat tttcttatac cacccacagc 4140 gatgttttca tttttttact cttgaaatct ttaattaatt ttta 4184 25 1114 DNA Homo sapiens misc_feature Incyte ID No 1210450CB1 25 gggcaggttg cggtccggtc ctggcgcccg cgcagaacca gctgtctgag ctgcccgggc 60 agcgggggag cagcgagcgg gcttccgcga gccggagaag gcacaggcct gtcccgggtc 120 ccggcaggtc tgcgcgtctg ttcccagcgc tctgcgaggc ctaaaaagga ggagcaacct 180 gtccagaatc cctgcaggac aggaaaagga ggggaaatct cgacatggaa aaaccctaca 240 ataaaaatga aggaaacctg gaaaacgagg gaaagccaga agatgaagta gagcctgatg 300 atgaaggaaa gtcagacgag gaagaaaagc cagacgtgga ggggaagaca gaatgcgagg 360 gaaagagaga ggatgaggga gagccaggtg atgagggaca actggaagat gagggaagcc 420 aggaaaagca gggcaggtcc gaaggtgagg gcaagccaca aggcgagggc aagccagcct 480 cccaggcaaa gccagagagc cagccgcggg ccgccgaaaa gcgcccggct gaagattatg 540 tgccccggaa agcaaaaaga aaaacggaca gggggacgga cgattccccc aaggactctc 600 aggaggactt acaggaaagg catctgagca gtgaggagat gatgagagaa tgtggagatg 660 tgtcaagggc tcaagaggag ctaaggaaaa aacagaaaat gggtggtttt cattggatgc 720 aaagagatgt acaggatcca ttcgccccaa ggggacaacg gggtgtcagg ggagtgaggg 780 gtggaggtag gggccagaaa gacttagaag atgtcccata tgtttaatgt ctttggcctt 840 taattctgat ttctctgatg ggaatattgc cagtcctgct tttcctggta ggtatttgcc 900 
ggcctaagtg ctttaacctt aagctgatac tttcctttag gtgtcactct tgttaccagc 960 agacttttga cccaactaca gtgctctgtc ttttagtaga ggattttcac ccatgtgcat 1020 ggaataaatg ttcatggtac attgtaaaat aacaataaaa aagagttttc agaaccatga 1080 aaaaaaaaaa aaaaaaaaaa aaaaaatggc ggtc 1114 26 1625 DNA Homo sapiens misc_feature Incyte ID No 427539CB1 26 ctcgacagcg gcaagtttgg gagttgcacg agtttgcggg gcgggggaca ggccaggagg 60 gtggccatgg aggaggagcg ggggtcggcg ctggcggccg agtcggcgct ggagaagaac 120 gtggccgagc tgaccgtcat ggacgtgtac gacatcgcgt cgcttgtggg ccacgagttc 180 gagcgggtca ttgaccagca cggctgcgag gccatcgcgc gcctcatgcc caaggtcgtg 240 cgcgtcctgg agatcctgga ggtgctggtc agccgccacc acgtcgcgcc cgagctggac 300 gagctgcgcc tggagctgga ccgcctgcgc ctggagagga tggaccgcat cgagaaggag 360 cgcaagcacc agaaggagct ggagctggtg gaggatgtgt ggcgagggga ggcgcaggac 420 ctcctctccc agatcgccca gctgcaggag gagaacaagc agctcatgac caacctctcc 480 cacaaggatg tcaacttctc agaggaggag ttccagaagc atgaaggcat gtcagagcgg 540 gagcgacagg tgatgaagaa gctgaaggag gtggtggaca aacaacgcga cgagatccgc 600 gccaaggaca gggagctggg cctgaaaaat gaggacgttg aggctttaca gcagcagcag 660 acacggctga tgaagatcaa ccatgacctt cggcaccggg tcacggtggt ggaggcccag 720 gggaaagccc tgatcgaaca gaaggtggag ctggaggcag acctgcagac caaggagcag 780 gagatgggca gcctgcgagc agagctgggg aagttgcgag agaggctgca gggggagcac 840 agccagaatg gggaggagga gcctgagacg gagccggtgg gagaggagag catctccgac 900 gcagagaagg tggccatgga tctcaaggac cccaaccgcc cccggttcac cctgcaggag 960 ctgcgggacg tgctgcacga gaggaacgag ctcaagtcca aggtgttctt gctgcaggag 1020 gagctggctt actataagag tgaagaaatg gaagaggaaa accgaatacc ccaaccccca 1080 cccatcgccc acccgaggac gtccccccag ccggagtcgg gcatcaagcg actgtttagc 1140 ttcttctccc gagataagaa gcgcctggcc aacacacaga gaaacgtgca catccaggag 1200 tcctttggac agtgggcaaa cacccaccgc gatgacggtt acacagagca aggacaggaa 1260 gccctgcagc atctgtgacc ttggcccatc tccaccctcc aacctggact gcccgccacc 1320 agcgcctgca accgaactgc agcccagggg tcattgctgc ctcaagcctc tcggtgcaga 1380 tgcaccctga aaactgaccc ctcaaacaga ctgtctgatt tgaggatgga 
cattgaaaaa 1440 ctgacgccaa actctaaaga aatgtttatt tatacccagg gctatcactg tttctaatag 1500 atgactctga tcccgtagga tatatattta ataatcccac aaacggaggc cagacttctg 1560 cgttaacttc agtaacacaa gcttctttaa gccaaataca tcacttgcca ctaaaaaaaa 1620 aaaaa 1625 27 4713 DNA Homo sapiens misc_feature Incyte ID No 1545043CB1 27 gcggccgctg cagccagccc cgcggctccc tcagacccgc gggcgcagcc gccgggggtg 60 aggcgcttgg ggaccgcggg ccgagcggcg gggatccccg agcaccatgc tggacccgtc 120 ttccagcgaa gaggagtcgg acgaggggct ggaagaggaa agccgcgatg tgctggtggc 180 agccggcagc tcgcagcgag ctcctccagc cccgactcgg gaagggcggc gggacgcgcc 240 ggggcgcgcg ggcggcggcg gcgcggccag atctgtgagc ccgagcccct ctgtgctcag 300 cgaggggcga gacgagcccc agcggcagct ggacgatgag caggagcgga ggatccgcct 360 gcagctctac gtcttcgtcg tgaggtgcat cgcgtacccc ttcaacgcca agcagcccac 420 cgacatggcc cggaggcagc agaagcttaa caaacaacag ttgcagttac tgaaagaacg 480 gttccaggcc ttcctcaatg gggaaaccca aattgtagct gacgaagcat tttgcaacgc 540 agttcggagt tattatgagg tttttctaaa gagtgaccga gtggccagaa tggtacagag 600 tggagggtgt tctgctaatg acttcagaga agtatttaag aaaaacatag aaaaacgtgt 660 gcggagtttg ccagaaatag atggcttgag caaagagaca gtgttgagct catggatagc 720 caaatatgat gccatttaca gaggtgaaga ggacttgtgc aaacagccaa atagaatggc 780 cctaagtgca gtgtctgaac ttattctgag caaggaacaa ctctatgaaa tgtttcagca 840 gattctgggt attaaaaaac tagaacacca gctcctttat aatgcatgtc agctggataa 900 cgcagatgaa caagcagccc agatcagaag ggaacttgat ggccggctgc aattggcaga 960 taaaatggca aaggaaagaa aattccccaa atttatagca aaagatatgg agaatatgta 1020 tatagaagag ttgcggtctt cagtgaattt gctaatggcc aatttggaaa gtcttccagt 1080 ttcgaaaggt ggtccggaat ttaaattaca aaaattaaaa cgttcacaga actctgcatt 1140 tttggacata ggagatgaga atgagattca gctgtcaaag tccgacgtgg tactgtcatt 1200 caccttagag attgtcataa tggaagtgca aggcctgaag tcagttgctc ccaatcgaat 1260 tgtttactgt acaatggaag tggaaggaga aaaacttcag acagaccagg ccgaagcctc 1320 aaggccacaa tgggggactc aaggagattt caccaccacc catcctcggc ctgtggtcaa 1380 agtgaaactc ttcacagaaa gcactggagt tctggccctg gaagataaag aactgggaag 1440 
ggtgatatta tacccaactt ctaatagctc caaatcagct gaattacacc gaatggtagt 1500 tccaaaaaat agccaggatt ctgacttaaa aatcaaactg gcagtgcgaa tggataaacc 1560 agcacatatg aagcatagtg gatatctgta tgcccttgga cagaaggttt ggaaaagatg 1620 gaaaaaacgt tactttgttc tagttcaggt tagccaatat acctttgcta tgtgcagtta 1680 tagagaaaag aagtctgaac cacaagaatt aatgcagctt gaaggctata ctgtggatta 1740 taccgatccc cacccaggcc ttcagggtgg ttgtatgttc tttaatgctg ttaaagaagg 1800 agatactgta atctttgcca gtgatgatga acaggacaga atattatggg ttcaagccat 1860 gtatagggcc acaggtcaat catataaacc agttcctgca attcaaaccc agaaactgaa 1920 tcctaaagga ggaactctcc atgcagatgc tcagctttat gcagatcgtt ttcagaaaca 1980 tggtatggat gagtttattt ctgcaaaccc ctgcaagctt gatcatgcct tcctttttag 2040 aatactccag aggcagactt tggatcacag actgaatgat tcctattctt gcttgggatg 2100 gtttagccct ggccaagtct ttgtgttaga tgagtactgt gcccgttatg gtgtgagagg 2160 ctgtcacaga catctctgct accttgcaga actgatggaa cattcagaaa atggtgctgt 2220 cattgaccct accctgctcc attacagctt tgcattctgt gcctctcatg tgcacggcaa 2280 caggcctgat ggaattggga ctgtttcagt ggaagaaaaa gaaagatttg aggagataaa 2340 agagagactc tcttcccttt tagaaaatca gataagccat ttcagatact gttttccctt 2400 tggacgacct gaaggtgctc taaaagctac actttcatta cttgaaaggg ttttaatgaa 2460 agatattgcc actcccatac cagcagaaga ggtgaagaaa gtggtcagaa aatgtctcga 2520 gaaagctgcc ttgatcaatt acactagact cacagaatat gccaaaatag aagagaccat 2580 gaaccaggca tctcctgcta gaaagctgga agagattctt catctggcag agctctgcat 2640 agaagtctta cagcagaatg aagagcatca tgcagaggca tttgcctggt ggcctgattt 2700 attggctgaa catgcagaga aattttgggc tttatttaca gtggatatgg acactgcact 2760 agaggctcaa ccgcaagact cctgggatag ttttcctctt ttccaactgc ttaataattt 2820 cctccgaaat gacacacttt tgtgtaatgg aaaatttcac aaacacttgc aagaaatctt 2880 tgtacccttg gttgtccgct atgtggatct catggagtct tccatcgccc agtcaattca 2940 cagaggtttt gagcaggaga catggcagcc tgtcaacaat ggctcagcaa catcagaaga 3000 ccttttttgg aagcttgatg cactgcaaat gtttgtcttt gatctgcact ggccagaaca 3060 ggaatttgcc caccacttag agcaaagact taaactaatg gccagtgata tgctagaggc 3120 ctgtgtcaaa 
agaacaagaa ctgcatttga actcaagcta caaaaggcaa gcaaaacaac 3180 tgacttgcgc attccagctt ccgtttgcac tatgtttaat gtattagtcg atgccaaaaa 3240 gcaaagcacc aaactctgtg ccctggatgg aggacaagag tttggtagtc aatggcaaca 3300 gtaccattca aaaatagatg atctgatcga caacagtgta aaagaaatca tttcactgtt 3360 agtttcaaag tttgtttcag tgttggaagg cgtgttgtct aagctgtcaa ggtatgatga 3420 aggcactttc ttttcatcca ttctgtcatt cactgtgaaa gcagctgcaa aatatgttga 3480 tgttccaaaa ccaggaatgg atctggcaga cacctatatt atgtttgttc ggcaaaacca 3540 agatattctt cgagaaaagg tcaatgagga aatgtatata gaaaagttat ttgatcaatg 3600 gtacagcagt tccatgaaag tcatttgcgt gtggttgact gatagattag acctccaact 3660 ccatatttac cagctgaaga cgctcatcaa gattgtgaag aaaacctaca gggactttcg 3720 attgcagggt gtgttggaag gaacactgaa cagtaagact tatgatactg tgcacagacg 3780 tttaacagta gaggaggcca cagcctctgt ttcagaagga ggaggacttc agggcattac 3840 tatgaaagac agtgacgaag aagaagaagg ctgatatcac acagctttgc agaaggaagg 3900 aagaccttga tcgacattgt tttttatttt tttaaccttg tccttgtaat tacattcatt 3960 gtttgttttg gccaaataaa aatgcttgta tttctttaaa aagtaagcct gaatgtagag 4020 taaaagggga aatgccaaga ttttggggtt tttttgtttc ctttttttgt ttgtttgttt 4080 gtttgttttt ttggagaaga gcatcctctt ttgtgtagtt tgacctaaaa atgaaccttg 4140 gctctgcttg tgatcagaac atgaactttt ttttttaaag aagatttgag catttttctg 4200 taatcacatc aaaatgatgt tttctgtgta aagcgagata catatttctc ataatgcagc 4260 attgtgagaa gtcagttcgg accactgcac caacactgtc gtatccttgt taaaatggtg 4320 tgtaccttac aaattataat ttatgtgcca ggttcgtttt gtacttaatt tgctattatt 4380 gtgatgtgta taaaatcttt aatcttggtt cttagtactt tgaattggtc tacaggtata 4440 ttcctgggat gaaaggattg ccaaacccaa atatagacta gattatccaa tgggtttgtg 4500 tctttgttcc attctcaaca tttcttcttt caactataag taatccccag gtgtggggta 4560 gcaagtgtgc ttccgtcaag ataccatatt ctcctgctcc agtataacag cttgcaggca 4620 ataaaaatct atttgctcat aactacttct gtatttatta gacttatata gagcaaatgc 4680 agtaaaagag gtttgcagtg tttcaaacat ccc 4713 28 2571 DNA Homo sapiens misc_feature Incyte ID No 7488231CB1 28 cgccgcgctc cgagtcccat tcccgagctg ccgctgttgt cgctcgctca 
gcgtctccct 60 ctcggccgcc ctctcctcgg gacgatggcg cgcggtggcc gcggccgccg cctggggtta 120 gccctggggc tgctgctggc gctggtgctg gcgccgcggg ttctgcgggc caagcccacg 180 gtgcgcaaag agcgcgtggt gcggcccgac tcggagctgg gcgagcggcc ccctgaggac 240 aaccagagct tccagtacga ccacgaggcc ttcctgggca aggaggactc caagaccttc 300 gaccagctca ccccggacga gagcaaggag gactccaaga ccttcgacca gctcaccccg 360 gacgagagca aggagaggct agggaagatt gttgatcgaa tcgacaatga tggggatggc 420 tttgtcacta ctgaggagct gaaaacctgg atcaaacggg tgcagaaaag atacatcttt 480 gataatgtcg ccaaagtctg gaaggcatat gatagggaca aggatgataa aatttcctgg 540 gaagaataca aacaagccac ctatggttac tacctaggaa accccgcaga gtttcatgat 600 tcttcagatc atcacacctt taaaaagatg ctgccacgtg atgagagaag attcaaagct 660 gcagacctca atggtgacct gacagctact cgggaggagt tcactgcctt tctgcatcct 720 gaagagtttg aacatatgaa ggaaattgtg gttttggaaa ccctggagga catcgacaag 780 aacggggatg ggtttgtgga tcaggatgag tatattgcgg atatgttttc ccatgaggag 840 aatggccctg agccagactg ggttttatca gaacgggagc agtttaacga attccgggat 900 ctgaacaagg acgggaagtt agacaaagat gagattcgcc actggatcct ccctcaagat 960 tatgatcatg cacaggctga ggccaggcat ctggtatatg aatcagacaa aaacaaggat 1020 gagaagctaa ctaaagagga aatattggag aactggaaca tgtttgtcgg aagccaagct 1080 accaattacg gggaagatct cacaaaaaat catgatgagc tttgatagac actcaccaga 1140 atatggcaga ctgtcatagg cattctgtta ttgtcttgga ttgttgctac aattgtctaa 1200 tttacagcag ttgtgatccc acaaaaagca agtttatacc tcagattggg gtataaaaat 1260 tgtttttcgc tcagtattta ctggaaaatg gacatcacta gtctttcagt aagatttctc 1320 tcaaaacacg tgaaaacctg gatcaaacgg gtgcagaaaa gatacatctt tgataatgtc 1380 gccaaagtct ggaaggatta tgatagggac aaggatgata aaatttcctg ggaagaatac 1440 aaacaagcca cctatggtta ctacctagga aaccccgcag agtttcatga ttcttcagat 1500 catcacacct ttaaaaagat gctgccacgt gatgagagaa gattcaaagc tgcagacctc 1560 aatggtgacc tgacagctac tcgagagcaa ggagaggcta gggaagattg ttgatcgaat 1620 cgacaatgat ggggatggct ttgtcactac tgaggagctg aaaacctgga tcaaacgggt 1680 gcagaaaaga tacatctttg ataatgtcgc caaagtctgg aaggattatg atagggacaa 1740 ggatgataaa 
atttcctggg aagaatacaa acaagccacc tatggttact acctaggaaa 1800 ccccgcagag tttcatgatt cttcagatca tcacaccttt aaaaagatgc tgccacgtga 1860 tgagagaaga ttcaaagctg cagacctcaa tggtgacctg acagctactc gggaggattt 1920 cactgccttt ctgcatcctg aagagtttga acatatgaag gaaatgtggt ttggaaaccc 1980 tggaggacat cgacaagaac ggggatgggt tgtggatcag gatgagtata tgcggatatg 2040 ttttcccatg aggagaatgg cctgagccag actgggtttt atcaagaacg ggagcagttt 2100 aacgaattcc gggatctgaa caaggacggg aagttagaca aagatgagat tccgcactgg 2160 atcctccctc acgattatga tcatgcacag gctgaggcca ggcatctggt atatgatcag 2220 accaaaaaca gatgagaagc taactacacg cggacataat ggagcaccga accgtggtgg 2280 cggaagccaa gccaacccat tcacggggaa agaaccccaa caaagatcac gacgagctcg 2340 gacgacagcc acacgaatat ggcgcatgga catagccatc cggcatgggc ggcactgggg 2400 accaagccac ctccagcgcg tgaacaacaa agcgcatacc tcatgggaaa aagggtgacc 2460 aataagcaag aaacacgccg aaaagccaaa cgaacgagga cccggcgagg cagcaagctg 2520 acaatgagac aacgacagga gcacagcaag agcaccgccc caacatgata t 2571 29 4953 DNA Homo sapiens misc_feature Incyte ID No 1910008CB1 29 gcgggaggaa ctccgtcgtc ctccgcgcgc gcctgcccgc ggctggggaa ccaagggcct 60 ggaggggaaa cagctccctg ctgcgttgtt ttgggctccg tgagccgaaa gggggagggg 120 agacgaagga gaagcaaaca ctgcgcagcg ggaccgtgcg cggcctccgc tcctgcgtgc 180 gtgcgctcgc tcgcgcgggc tcagtgctgg cgcgtgaggc ggaggcgggc ggcggcggcg 240 gcgcctgcgc ggtcggactc ggtcgacggt tcgcagggga ggaggcggcg ggaggcggag 300 gaggcggcgg cggcgatgga ggtgaagcgg ctgaaagtga ccgagctgcg gtcggagctg 360 cagcggcggg gcctggactc gcgcggcctc aaggtggatc tggcgcagcg gctgcaggag 420 gcgctggacg ccgagatgct cgaggacgag gccggcggcg gcggggccgg gcccggcggg 480 gcctgcaagg cggagcctcg gcctgtggcc gcgtcgggcg gcggcccggg cggggacgag 540 gaggaggacg aagaggagga ggaggaggac gaggaggcgc tgcttgagga cgaggacgag 600 gagccacccc ctgctcaagc cttgggtcag gccgcgcagc cgccgccgga gcccccggag 660 gcggcagcca tggaggccgc ggccgagcca gatgcttccg agaagccggc ggaggccacg 720 gccgggtcag gcggggtaaa tggtggcgaa gagcagggcc tcggcaagag ggaggaagac 780 gaacccgagg agcggagcgg ggacgagacg ccgggatccg 
aggtgccggg tgacaaggcc 840 gccgaggaac agggagatga ccaggatagt gaaaagtcaa aaccagcagg ctcagatggt 900 gagcggcggg gggtaaagag acagcgggat gagaaggatg aacatggccg agcttactat 960 gaattccgag aggaggctta ccacagccgc tcaaagtctc cactgcctcc tgaagaagag 1020 gcaaaagatg aggaggagga tcaaactctt gtgaacctgg acacgtatac ctcggatctg 1080 cattttcaag tgagcaaaga ccgctatgga gggcagccac ttttctcaga gaagttcccc 1140 accctttggt ctggggcaag gagtacttac ggagtgacaa agggaaaagt ctgctttgag 1200 gcaaaggtaa cccagaatct cccaatgaaa gaaggctgca cagaggtctc tctccttcga 1260 gttgggtggt ctgttgattt ttcccgtcca cagcttggtg aagatgaatt ctcttacggt 1320 ttcgatggac gaggactcaa ggcagaaaat ggacaatttg aggaatttgg ccagactttt 1380 ggggagaatg atgttattgg ctgctttgct aattttgaga ctgaagaagt agaactttcc 1440 ttctccaaga atggagaaga cctaggtgtg gcattctgga tcagcaagga ttccctggca 1500 gaccgggccc ttctacccca tgtcctctgc aaaaattgtg ttgtagaatt aaacttcggt 1560 cagaaggagg agcccttctt cccaccacca gaagagtttg tgttcattca tgctgtgcct 1620 gttgaggagc gtgtacgcac tgcagtccct cccaagacca tagaggaatg tgaggtgatt 1680 ctgatggtgg gactacccgg atctggaaag acccagtggg cactgaaata tgcaaaagaa 1740 aaccctgaga aaagatacaa tgtcctggga gctgagactg tgctcaatca aatgaggatg 1800 aagggtctcg aggagccaga gatggacccc aaaagccgag accttttagt tcagcaagcc 1860 tcccagtgcc ttagtaagct ggtccagatt gcttcccgga caaagaggaa ctttattctt 1920 gatcagtgta atgtgtacaa ttctggccaa cggcggaagc tattgctgtt caagaccttc 1980 tctcggaaag tggtggtggt tgtccctaat gaggaagatt ggaagaagag gctggagttg 2040 aggaaggaag tagagggaga tgatgtgcct gaatctataa tgctggagat gaaagccaac 2100 ttctctttgc ctgaaaaatg cgactatatg gatgaggtga catatgggga gctggagaag 2160 gaggaagctc agcccattgt cactaagtac aaggaggagg caaggaagct tctgcccccc 2220 tccgagaagc ggacaaatcg ccgaaacaac cgaaacaagc gtaaccggca gaaccgaagc 2280 cggggccaag gctatgtggg cgggcagcgc cgaggctacg acaaccgggc ctacgggcag 2340 cagtactggg ggcagcctgg aaacagaggg ggttaccgta atttctatga tcgatacagg 2400 ggagactatg atcgatttta cgggcgagat tatgagtaca acaaacaaca aaactgtaca 2460 ttttttttaa agtttgttga aaagaatatt gtcttattct ataaaacatt 
tcaaacctag 2520 ttagagattt gtaatcaaaa aacatttgcg cagaaagcag cacttagggc tgcctgttct 2580 ataccctaca gtcagacagg aaaagaactg aaaatggcac ccttctgaca ttctgaggca 2640 gctggactgg cagccaagta aaggagagtg atgaggtggt gtggggaggg tggggaggca 2700 gcgcgagggt gctctccaca gcaggtagga gcaggcagga gtggggacag agggagagct 2760 tgtgactggg acagtgaaaa taaacgattg gctcttatca atcacttgca ccaactaacc 2820 gtttgtattt tctttaccga cctttccctt ttgggatatg tggtctggtg ggctgggagg 2880 cttacagtgt cccaactctc cattttcctg tgcctttgtg ctgtttggct gaggcatgga 2940 gactgccagt gggtggttct attttacagg attcccaggc caagaaagcc tggggactgt 3000 ccttgatggc agagcagaat agcctgcccc ttctatagac ccattgatta attcctgtaa 3060 aatcttgggg agaatgggat tgcagccctc agcctaaaga ttgtgatttg ccagtctcta 3120 gtctctgtct tccaaggttg agagaggtgg gaggtctctg aaatcatccc tgttagagtg 3180 tctgtcctct tacagtgcga gagaaggaac gtttctcagg gtttcagctc acacgaaaac 3240 acagcaggat tcttttcact gcagcgggat atgtatatag gtgaatcctg tggtgtgggc 3300 tcacaggccc aggtgctgag aatagcaact agcaactatt ttctacagtg tggagggctc 3360 gtgccctgct ggtttttttg atacacaggt agagaatgaa tagaagaggg gagggtggtg 3420 ggggggggtg ggaggcagct cagtggggct gctgaacctt gggaagagtg tgagtgtgaa 3480 cgtgtgtaaa tgtgcgtgtg aaagggagca ccccttcccg acttctagaa acaaaattcc 3540 ttttggggcg ctttccctgt gtgtccccag agaggcctct agccaggagc cttcagttgg 3600 gatagtttca tttgtgactt taccaatacc ctcccagttc ttgatagaca gctgtaggtt 3660 gctgggttca agaatatggg tgggatatgg aatgctcttt caatgtctag cttcagtttt 3720 cattcatcct cctgctcagc actgtcagcc aagagcttac tcagcagaca ccacatactg 3780 cagcagttcc tagtgagaaa atctgtgcca ctagaaaatg cttcacttcc atttcctcac 3840 ctgggcagtt ctctgtttaa aattgtgggc tgatttggtc ttcctctcct cctcccactg 3900 ttactgccct gcagcccttg ttcaggtgta cagaccctta ttctggcctc tagtgtcctt 3960 gtctgtcatg acacaccctt ccgcccaaat acctctgacc ccaaggctgg aatggggctg 4020 gtaggagata agtttgctta ctcatagtca tgtcctttct cttggcacct gcttccctgc 4080 ggtgtcctca aatggatttc tgtgtggcag tggagtgatt gcatgaattt ttctgtaaca 4140 cattaacttt gtattattat taagggagtt tgagaaagct ttgcttataa tgtcaaggca 
4200 aggaggtaaa aactggagcc caaagaaatt cccttagggc aagattatgt tataatagaa 4260 aattgaattt cctgaggcag tggctgccac ccccttttca gatgtttagt cctgcaaata 4320 gcatctttct tgtagtctgt gacatggatg gggatgctag ggcccttagg ggcaagggga 4380 ctaaactaaa tcaagttgag tttttttcca gcaggggtta ggggaggtac tcgctgttga 4440 tatttgacac tagaaagtaa tcttttttac aaaactgttt ttctaggtgg gtggaaagtg 4500 aaactgccac atccttgttg gtttagtcca agagatcatt tgcaacaaca gtagatgtcc 4560 gggttttgtt tctgtctttt tattatgaaa aactatgtta agggggaaaa tgtggattat 4620 ggtaaccaga ggaatcccta gccttgtttt ccttagaaga cttgtttagt gttttatcag 4680 acgtctgttg tagttgtaga caggaaagct tgtgagaaaa acaccacatg gagcctgtaa 4740 atgtttttgc acaacctgta aagcattctt ggaagtggcc agtaaaaagg ggttttacca 4800 tttaaaaaaa aaatgtaact gtgtcattgt ttacatctgt aactttttcc tcccctgttc 4860 tcattacacc attctggcga aaatgtaggc aaagtagctt ccagttttag aataaataac 4920 catttggatt gaaaaacaaa aaaaaaaaaa aaa 4953 30 2259 DNA Homo sapiens misc_feature Incyte ID No 5151459CB1 30 catttccaag aatgctaatg aactcgaact cggcaggctg ggtttctgcg cgggcgtgga 60 gctggagcgc atgcgctcgt ccgttaccat aacgaccaga agacgctgca gccactaggg 120 aggagagcaa agtaatcaga acctcccaag gatggataac aaaatttcgc cggaggccca 180 agtggcggag ctggaacttg acgccgtgat cggcttcaat ggacatgtgc ccactggtct 240 caaatgccat cctgaccagg agcatatgat ttatcctctt ggttgcacag tcctcattca 300 ggcaataaat actaaagagc agaacttcct acagggtcat ggcaacaacg tctcctgctt 360 ggccatctcc aggtctggag agtacatcgc ctccggacaa gtcacattca tggggttcaa 420 ggcagacatc attttgtggg attataagaa cagagagctg cttgctcggc tgtcccttca 480 caaaggcaaa attgaagctc tggccttttc tccaaatgat ttgtacttgg tatcactagg 540 aggcccagat gacggaagtg tggtggtgtg gagcatagcc aagagagatg ccatctgtgg 600 cagccctgca gccggcctca atgttggcaa tgccaccaat gtgatcttct ccaggtgccg 660 ggatgagatg tttatgactg ctggaaatgg gacaattcga gtatgggaat tggatcttcc 720 aaatagaaaa atctggccaa ctgagtgcca aacaggacag ttgaaaagaa tagtcatgag 780 tattggagtg gatgatgatg atagcttttt ctaccttggc accacgactg gagatattct 840 aaaaatgaac cccaggacta aactgctgac agatgttggg cctgcgaagg 
acaaattcag 900 tttgggagtg tcagctatca ggtgcctgaa gatggggggt ttgttggtgg gctctggagc 960 cggactgctg gtcttctgta aaagccctgg ctacaaaccc atcaagaaga ttcagttaca 1020 aggcggcatc acttctatca cacttcgagg agaaggacac cagtttctcg taggaacaga 1080 agaatcgcac atttatcgtg tcagcttcac ggatttcaaa gagacgctca tagcgacttg 1140 tcactttgat gctgtcaagg atattgtctt tccattgttc aagcgattct cctgcctcag 1200 cctcctgagt agctgggatt acagtggcac tgctgagcta tttgcaacct gtgccaagaa 1260 ggatatcagg gtgtggcaca catcatccaa cagggagctg ctgcggatca ccgtgcccaa 1320 catgacctgc cacggcatcg acttcatgag ggacggcaaa agcatcattt cagcatggaa 1380 cgacggtaaa atccgagcct tcgccccaga gacaggccga ctgatgtatg tcattaacaa 1440 tgctcacagg atcggcgtca ccgccatcgc caccaccagt gactgtaaaa gggtcatcag 1500 tggcggtggg gaaggggagg tgagggtatg gcagataggc tgtcagaccc agaagctgga 1560 ggaggccctg aaggaacaca agtcatcagt gtcctgcatt agggtgaaga ggaacaacga 1620 ggagtgtgtc accgccagca ccgatgggac ttgtatcatt tgggaccttg tgcgtctcag 1680 gaggaatcag atgatactag ccaacacctt attccagtgt gtgtgctatc accctgagga 1740 gttccagatc atcaccagcg gaacagacag aaagattgct tactgggaag tatttgatgg 1800 gacagtaatc agagaattgg aaggttccct gtctgggtcg ataaatggca tggatatcac 1860 acaggaaggg gtgcactttg tcacaggtgg aaatgaccat ctggtcaaag tttgggatta 1920 taatgagggt gaagtgactc acgttggggt gggacacagt ggcaacatca cacgcatccg 1980 cataagtcca ggaaatcaat atattgttag tgtaagtgcc gatggagcca ttttgcgatg 2040 gaagtaccca tatacctcct gaagctgatg agatgtctct gagccttggc gttgcacgca 2100 gtcctgttga agactgagtt tagataactc caacactagt cttcatttct cacagctctg 2160 tttttgttct tgagtcaatt tttctctttt tctttataga atgcatttta tattcttaaa 2220 ttgcatatta aaattgaagt atgttcaaga aaaaaaaaa 2259 31 2974 DNA Homo sapiens misc_feature Incyte ID No 55140256CB1 31 caggctggtc tcaaactcct gacctcaggt gatctgcccc gcctaagcct cccaaagagc 60 tgggattaca ggcgtgagcc acagtgcctg gccaattttt gtattttttt atagagatgg 120 ggtttcgcca tgttggccag gctagtctca aactcctggg ctcaagtgat ctgcctgcct 180 cagcctccca aagtgttggg attacaggca ggagccactg ctctgggcct ttatgcttct 240 gttcaagctg attttccttt 
tatcaagccc caactcagcc tagccccacg ctgcctccct 300 caggagtgct tcattgactg ctgctgcctt tccctccaag cacagttgca tgctgctgct 360 tttcatgcat gtgtctctgt ccatcctggg atgttcccgt ctactccttc agaagggtga 420 ttttgtgtcc tctttcatgg aatctgtccc tagtggactg gccaaggccc atgggaggga 480 ctcgggggaa ggattgactg ctctctcctg atgcccctgt cagtggctct tggtgtggct 540 gccatcaatc aagccatcaa ggagggcaag gcagcccaga ctgagcgggt gttgaggaac 600 cccgcagtgg cccttcgagg ggtagttccc gactgtgcca acggctacca gcgagccctg 660 gaaagtgcca tggcaaagaa acagcgtcca ggtaatggcc ccaacctgac cctcctcttc 720 gctgcctgcc ccagccaccc cttgcctgct ctgaaccttt tctgcttgct tggtttctgc 780 ttagcagaca cagctttctg ggttcaacat gacatgaagg atggcactgc ctactacttc 840 catctgcaga ccttccaggg gatctgggag caacctcctg gctgccccct caacacctct 900 cacctgaccc gggaggagat ccaatcaact gtcaccaagg tcactgctgc ctatgaccgc 960 caacagctct ggaaagccaa cgtcggcttt gttatccagc tccaggcccg cctccgtggc 1020 ttcctagttc ggcagaagtt tgctgagcat tcccactttc tgaggacctg gctcccagca 1080 gtcatcaaga tccaggctca ttggcggggt tataggcagc ggaagattta cctggagtgg 1140 ttgcagtatt ttaaagcaaa cctggatgcc ataatcaaga tccaggcctg ggcccggatg 1200 tgggcagctc ggaggcaata cctgaggcgt ctgcactact tccagaagaa tgttaactcc 1260 attgtgaaga tccaggcatt tttccgagcc aggaaagccc aagatgacta caggatatta 1320 gtgcatgcac cccaccctcc tctcagtgtg gtacgcagat ttgcccatct cttgaatcaa 1380 agccagcaag acttcttggc tgaggcagag ctgctgaagc tccaggaaga ggtagttagg 1440 aagatccgat ccaatcagca gctggagcag gacctcaaca tcatggacat caagattggc 1500 ctgctggtga agaaccggat cactctgcag gaagtggtct cccactgcaa gaagctgacc 1560 aagaggaata aggaacagct gtcagatatg atggttctgg acaagcagaa gggtttaaag 1620 tcgctgagca aagagaaacg gcaggaacta gaagcatacc aacacctctt ctacctgctc 1680 cagactcagc ccatctacct ggccaagctg atctttcaga tgccacagaa caaaaccacc 1740 aagctcatgg aggcagtgat tttcagcctg tacaactatg cctccagccg ccgagaggcc 1800 tatctcctgc tccagctgtt caagacagca ctccaggagg aaatcaagtc aaaggtggag 1860 cagccccagg acgtggtgac aggcaaccca acagtggtga ggctggtggt gagattctac 1920 cgtaatgggc ggggacagag tgccctgcag gagattctgg 
gcaaggttat ccaggatgtg 1980 ctagaagaca aagtgctcag cgtccacaca gaccctgtcc acctctataa gaactggatc 2040 aaccagactg aggcccagac agggcagcgc agccatctcc catatgatgt caccccggag 2100 caggccttga gccaccccga ggtccagaga cgactggaca tcgccctacg caacctcctc 2160 gccatgactg ataagttcct tttagccatc acctcatctg tggaccaaat tccgtatgtg 2220 caacagcccc accccagcct acgcatggga tttcaagagg gccaggccac catggggagc 2280 agagcagggt gatagtcctg cagggaggaa ggcagggagg gagatggagc caggggtact 2340 cagaggggtt cctgaagaga tgtggggaat gctcgaagct cctgctagag agctgagctg 2400 agagacggcg tgcaagtctg tgtgtggcct ggtaggatgt cagtggaacc agaggaggga 2460 ccctaaaatg atctcaggga cactttatta gctaagagca aatgaggaga aaatgggact 2520 cttcagttcc caatatcaaa aataaatatt ttaagcccat aaagtacttt atagttctct 2580 cttttcttaa aatttagtca atacgtttgg ttttctacat ttatttttga cacatgataa 2640 ttgtacatat tcatagggta tagtgtgatg ttttgatacg tgtatacatt gtgcaatgaa 2700 tgatcaaatc agggtaatta gcatatctat tacctcaaac atttatttct ttttggtgag 2760 aacattcaaa atccttctag ctattttgag atatacaaca cattattatc aactatagtc 2820 accttctgtg caatagagca ccagaactta ttctttctac ctggatgtaa ctttgtaccc 2880 attgaccaac ctctccccat cctctctccc cttctcctac cctccccaga ctccggtaac 2940 caccattcta ctctctactc ctgtgaaatc acct 2974 32 5121 DNA Homo sapiens misc_feature Incyte ID No 2744344CB1 32 gctacctaag gcgtgaggct acgagcggtc ggctgtggca gcttctcttg tctctgacgg 60 cttgtagtta tggggcagga gccgcggacg ctgccgccct cccccaactg gtactgcgcc 120 cgctgcagcg atgccgtgcc cgggggcctc tttggcttcg ccgcgcggac ctccgtcttc 180 cttgtccgcg tgggcccggg cgcaggcgag agtccaggga cacccccgtt tcgagtcata 240 ggagagttgg tgggacacac cgaaagggtc tctggcttca cattttccca tcaccctggt 300 cagtacaacc tctgtgccac cagctccgac gatgggactg tgaaaatatg ggatgtagag 360 acaaaaacag ttgtgacaga acatgcactc catcagcata cgatatcaac attacattgg 420 tctcctcgag taaaggactt aatagtatct ggggatgaaa aaggagtagt tttctgttac 480 tggtttaaca gaaatgacag ccagcacctc tttatagaac ccaggacaat tttctgtctt 540 acttgttcac ctcatcatga agatttagta gccattggct acaaggatgg catagtggtg 600 ataattgaca tcagtaagaa 
aggagaagtt attcataggc ttcgaggcca tgatgatgaa 660 atccactcca tagcctggtg tcccctgcct ggtgaagatt gtttatctat aaaccaagag 720 gaaacttcag aagaagctga aattaccaac gggaatgctg tagcacaagc tccagtaaca 780 aaaggttgct acttagccac tggaagcaaa gatcaaacca ttcgaatctg gagctgttct 840 agaggccgag gggtgatgat tttgaaattg ccctttctga agagaagagg agggggtata 900 gacccaactg ttaaagagcg cctttggttg acactccatt ggcccagcaa tcaaccaaca 960 cagctggtat ctagctgttt tggaggtgaa ctgttgcaat gggatctcac tcaatcttgg 1020 agacggaaat acaccctctt cagtgcctca tcagaagggc aaaatcattc aagaattgtg 1080 tttaatttat gtcctttaca aacagaggat gacaaacagc tgttactttc tacatcaatg 1140 gatagagatg taaaatgttg ggacatagcc accttggagt gcagctggac ccttccttcc 1200 cttggtgggt ttgcatacag cctggctttc tcttctgtgg acataggctc tttggccata 1260 ggtgttgggg atggcatgat ccgtgtatgg aatacactct ccataaagaa caactatgat 1320 gtgaaaaatt tttggcaagg cgtgaagtcc aaggttacag cgctgtgctg gcacccaacc 1380 aaggaaggtt gcttagcttt tggaactgat gatggaaaag tgggattgta tgacacctac 1440 tccaacaagc ctccacagat ttctagcaca tatcataaga agactgtata tacgttagcc 1500 tgggggccac cagtaccccc catgtcactt ggaggagaag gagacagacc ttcccttgct 1560 ttatacagct gtggaggaga agggattgtc ttacagcata atccctggaa gcttagtgga 1620 gaagcctttg acatcaacaa actcatcagg gacaccaatt caatcaaata caaattgcct 1680 gtacacacag agataagttg gaaagcagat ggcaaaatca tggctcttgg caatgaagat 1740 ggatcaatag aaatatttca gattcccaac ctgaaactga tctgtactat ccaacagcat 1800 cacaagcttg tgaataccat tagctggcat catgagcatg gcagccagcc agaattgagc 1860 tatctgatgg cctctggctc caacaatgca gtcatttacg tgcacaacct gaagactgtc 1920 atagagagca gccctgagtc tccagtgacc attacagagc cctaccggac cctctcaggg 1980 catacggcca agattaccag tgtggcgtgg agcccacatc atgatggaag gctggtatct 2040 gcttcctatg atggtacagc ccaggtgtgg gatgctctcc gggaagagcc cctgtgcaat 2100 ttccgaggac atcaaggtcg actgctttgt gtggcatggt ctcctttgga tccagactgc 2160 atctattcag gggcagatga cttttgtgtg cacaagtggc tcacttccat gcaagatcat 2220 tcccggcctc ctcaaggcaa aaaaagtatt gaattggaga aaaaacggct ctctcaacct 2280 aaggcaaagc ccaaaaagaa gaaaaagccc 
accttgagaa ctcctgtaaa gctggaatcg 2340 attgatggaa atgaagaaga aagcatgaag gagaactcag gacctgttga gaatggtgtg 2400 tcagaccaag aaggggagga gcaagcacgg gagccggaat taccctgtgg ccttgctcca 2460 gcggtttcta gagaaccagt tatctgcact ccagtttcct caggctttga aaagtcaaaa 2520 gtcaccatta ataacaaagt cattttactg aaaaaggagc caccaaaaga gaagccagaa 2580 accttaatca agaagagaaa agctcgttcc ttgcttcccc tgagtacaag cctggaccac 2640 agatccaaag aggagcttca tcaggactgt ttggtactag caactgcaaa gcactccaga 2700 gagctgaatg aagatgtgtc tgctgatgtt gaggaaagat ttcatctggg gcttttcaca 2760 gacagggcta ccctgtatag aatgattgat attgaaggaa aaggtcactt agaaaatggc 2820 caccctgagt tatttcacca gcttatgctt tggaaaggag atctcaaagg tgttctccag 2880 actgcagcag aaagagggga gctgacagac aaccttgtgg ctatggcacc agcagctggc 2940 taccatgtgt ggctatgggc tgtggaagct tttgccaaac agctgtgttt tcaggatcag 3000 tatgtcaagg ctgcttctca cctactttcc atccacaaag tgtatgaagc ggtggagctg 3060 ctcaagtcaa accattttta cagggaagct attgcgattg ccaaggcccg gctgcgcccg 3120 gaggacccag tcctgaagga cttgtacctc agctggggaa ccgtcctaga aagagatggc 3180 cactatgctg tagctgccaa atgctattta ggggccactt gtgcttatga tgcagccaaa 3240 gttttggcca aaaaggggga tgcggcatca cttagaacgg ctgcagagtt ggctgccatc 3300 gtaggagagg atgagttgtc tgcttccctg gctctcagat gtgcccaaga gctgcttctg 3360 gccaacaact gggtgggagc ccaggaagcc ctgcagctgc atgaaagtct acagggtcag 3420 agattggtgt tttgccttct ggagctactg tccaggcatc tggaggaaaa gcagctttca 3480 gagggcaaaa gctcctcctc ttaccacact tggaacacgg gcaccgaagg gcctttcgtg 3540 gagagggtga ctgcagtgtg gaagagcatc ttcagccttg acacccctga gcagtatcag 3600 gaagcctttc agaagctgca gaacatcaag tacccatctg ctacaaataa cacacctgcc 3660 aaacagctcc tgcttcacat ttgccatgac ttgaccctgg cagtgctgag ccaacagatg 3720 gcctcctggg acgaggctgt gcaggcgctc cttcgggcgg tggtccggag ctatgactca 3780 gggagcttca ccatcatgca ggaagtgtac tcagcctttc tccctgatgg ctgtgaccac 3840 ctaagagaca agttggggga ccatcaatcc cctgccacac cagctttcaa aagtttggag 3900 gccttttttc tttatgggcg tctgtatgaa ttctggtggt ctctctccag accttgccca 3960 aattccagtg tctgggtaag ggctggtcac agaacactct 
ctgttgagcc aagccagcag 4020 ttagacactg ccagcactga agaaacggac cctgaaactt ctcagccaga gccaaacagg 4080 ccttcagaac tagacttgag actcacagaa gaaggtgagc gaatgctgag tacttttaag 4140 gagctctttt cagaaaagca tgccagtctc caaaactcac agagaactgt tgctgaagtc 4200 caagagacct tggcagaaat gatccgacaa caccaaaaga gtcaactctg taaatccaca 4260 gcaaatggtc ctgataagaa tgaaccggaa gtagaagcag agcagcccct ctgcagttct 4320 cagagccagt gtaaagaaga aaaaaatgag ccactttctc tgcctgagtt aaccaaaagg 4380 cttaccgagg caaatcagag aatggcaaaa tttcctgaga gcattaaggc ctggcccttc 4440 ccagatgtgc tggagtgctg cctcgtcctg cttctcatca ggtcccactt tcctggctgt 4500 ctggcccagg aaatgcagca gcaggcccaa gagctccttc agaaatacgg caacacgaaa 4560 acttacagaa gacactgcca gaccttctgt atgtgaattt tcacacacct tgaagaaact 4620 gccaaattga aaatgtttga catctttcac ctctgcagtt atgcctcacc agacattcac 4680 tctggtccct agatgttttt gcagtaatcc aaaagaatac aaacaaggat taagtttgaa 4740 tcaaccctgc ctacccatag acaacggtgg atctgacttt agactcaatt gtggtctcct 4800 actggaggga agatcatgaa aagcccacag tagttattca gaactaacac ctgcagagtg 4860 ttggtcatct ctacagcctt aggcaggttt cacccaaaga ggagaaactt ctgtcgtcac 4920 ccaaagtgtt acatgcttaa aacacaagct acctttgtaa atacttcatc tgatcagaag 4980 tgtgtcatgc ttgtttgaga tggagttgct gcattttagg actattgata ccttttttta 5040 attgttttta taatatttaa tttgaaagag gagacccctc tctctctact ctttcataga 5100 ctgaagtttg aatatgaaat a 5121 33 3625 DNA Homo sapiens misc_feature Incyte ID No 1555147CB1 33 cgtgaccctc tcggggatcc cacgatgttc ttctacctga gcaagaaaat ttccattccc 60 aataacgtga agctgcagtg tgtatcctgg aacaaggaac aagggttcat agcatgcggt 120 ggtgaagatg gattactgaa agttttgaaa ttagagacgc agacagatga tgcaaaattg 180 aggggccttg cagcccccag taacctttct atgaatcaga ctcttgaagg tcatagtggt 240 tctgttcaag ttgtaacatg gaatgagcag tatcagaagt tgactaccag tgatgaaaac 300 gggcttatca ttgtgtggat gttatataaa ggctcttgga ttgaggagat gatcaacaat 360 cgaaataaat cagttgttcg cagtatgagc tggaatgctg acggacagaa gatctgcatt 420 gtatatgaag atggggctgt gatagttggt tcagtggatg gcaatcgtat ttggggaaaa 480 gacctgaagg gtatacagct atcccatgta 
acatggtctg cggacagtaa agtcttactt 540 tttggaatgg caaatgggga aatacacatt tacgataatc aaggaaattt tatgataaaa 600 atgaaactga gttgtttggt gaatgtcact ggagctatca gcattgctgg aattcattgg 660 taccatggca cagaaggcta cgtggagcct gattgccctt gccttgctgt ttgctttgat 720 aatggaagat gccaaataat gagacatgag aatgaccaaa atcccgtttt gattgacact 780 ggcatgtacg tagtaggcat ccagtggaac cacatgggca gcgtgttagc tgtggcaggc 840 ttccagaagg cagccatgca ggacaaagat gtgaacattg tgcagtttta cactccgttt 900 ggtgagcatc tgggtacttt gaaagttcct ggaaaggaaa tatctgcact atcttgggaa 960 ggaggtggac tgaaaattgc actagctgtt gattccttta tatattttgc aaacattcga 1020 cctaattata agtggggtta ttgctcaaac actgtagttt atgcatatac cagacctgat 1080 cgtccagaat attgtgttgt cttctgggat acgaaaaaca atgaaaaata tgttaaatat 1140 gtgaagggtc tcatttctat tactacctgt ggagatttct gcattttggc tacaaaagct 1200 gatgaaaatc atcctcagta ccattgtttg ttgcaatgac caaaacccat gtgatagcag 1260 cctcgaaaga agcattttat acctggcaat atcgtgtggc aaagaagctc acagcattgg 1320 aaattaatca gatcacacgg tctcgaaaag aagggagaga aagaatttat catgttgatg 1380 ataccccttc tggatcaatg gatggtgtgc ttgattatag taaaaccatt caaggcacaa 1440 gggatccaat ttgtgccata actgcatcag ataagatatt gattgtgggt cgtgaatctg 1500 gcaccattca gagatacagt ctacctaatg gtggtttgat tcaaaaatat tcccttaatt 1560 gtcgagccta ccagttatcc ttgaattgca actctagccg tcttgctatc atagacatct 1620 caggagttct gactttcttt gacttggatg ctcgagtaac ggacagtacg ggacagcaag 1680 tagttggaga gttgttaaaa ttggaacgaa gagatgtctg ggatatgaag tgggccaaag 1740 ataatcctga tttgtttgca atgatggaga agacaagaat gtatgttttc agaaacttgg 1800 atcctgagga acccattcag acctctggat atatttgtaa ttttgaggat ttagaaatta 1860 aatctgttct tttggatgag atattaaagg atccagaaca tccaaacaag gattacctaa 1920 ttaactttga gattcggtct ctgcgagata gccgagcact gattgagaag gttggaatta 1980 aagatgcatc tcagttcata gaggacaatc cacacccccg actttggcgc ctactggctg 2040 aagcagctct tcagaaactg gatctataca ctgcagagca agcatttgtg cgctgcaaag 2100 attaccaagg cattaagttt gtgaagcgct tgggcaaact actgagtgag tcaatgaaac 2160 aggctgaagt tgttggctac ttcggcaggt ttgaagaggc 
tgaaagaacg tatctcgaga 2220 tggacagaag ggatcttgct attggcctcc ggctgaaatt gggggattgg tttagagtac 2280 tccagctcct gaaaactgga tctggtgatg cagatgacag tctcctggaa caagccaaca 2340 atgccattgg agactacttt gctgatcgac aaaagtggtt gaatgctgta caatattatg 2400 tacaaggacg gaaccaggaa cgcttagctg aatgttacta tatgttagag gattatgaag 2460 ggttagagaa ccttgccatt tcacttccag aaaaccacaa gttacttcca gaaatagcac 2520 aaatgtttgt cagagttgga atgtgtgaac aagcagtgac tgcatttttg aaatgtagtc 2580 aaccaaaggc agcagtagat acctgcgtac atctcaacca atggaacaaa gctgttgaat 2640 tggctaaaaa tcatagtatg aaagaaattg gatctctgtt agctaggtat gcatctcatt 2700 tactggaaaa gaataaaact cttgatgcca tagaactcta tcggaaagcc aattactttt 2760 ttgatgcagc taaactgatg tttaagattg cagatgaaga ggcaaagaaa ggaagtaaac 2820 ctttacgtgt caagaagctc tatgtactgt cagccttact tatagagcaa taccatggac 2880 agatgaagaa tgcccagcga ggaaaagtta aaggaaaaag ttcagaggcc acttctgcct 2940 tggctggttt gctggaagaa gaagttctgt ctacaacaga tcgtttcaca gataatgcat 3000 ggagaggggc agaggcttac cacttcttta tacttgcaca gaggcagctc tatgagggat 3060 gtgtggacac tgcactgaag acagctcttc acctgaaaga ctatgaagac atcatccctc 3120 ctgtggagat ctactctctg ctagcactct gcgcatgcgc cagcagagcc tttgggactt 3180 gttcaaaagc tttcattaaa cttaaatctt tagagaccct cagttcagaa cagaaacagc 3240 agtatgaaga ccttgcttta gaaatcttca ccaaacatac ttcaaaagat aacagaaaac 3300 ctgaattgga cagccttatg gaagggtagg ctaattttaa ttagtggatg ccattctatc 3360 ttattagaag cttggatttc tagatgtaca atgtttaatg taggaattaa agcactgaat 3420 tttgaagcaa ttcacattaa cattataccc tattttattt atttttacaa gtgtattcat 3480 gctttatttt gtcattgtaa gaaaggtttt ttcttgaaat aactttttta aatgaaagta 3540 tttgatgttc atctcagaag ttttattctt taggtttttt ttaagtgtat taaataaagt 3600 tagactaatg agaagtttta aagta 3625 34 3988 DNA Homo sapiens misc_feature Incyte ID No 1939136CB1 34 cgaccttcga gactgaggaa gccagtcacc atgaggcatg cgtgcgcctg cggccccaga 60 cctatgacct ccaggagagc aacgtgcagc tcaagctgac cattgtggat gccgtgggct 120 ttggggatca gatcaataag gatgagaggc aagaggcggg aagggcggcc ccacccagcc 180 tcctcccacc ccacctacat tggcccctat 
aacagtagcc cagccctcac actgcagggg 240 gccagggagg gcctcttggg gaatatctga ggctctgtgg tcaccaacag accagttact 300 cctttaggtg tctggagaag gggtcagctg cctgtatcca gtcagggatc tcaggcagaa 360 gctgttccca gaaagaaaag gccagggggc agcctggctt ggccccgagc cctgagcccc 420 ccaagcccca agcccctgat ctcagctggc agcctcctgg gtgatggagc tgtctgtagt 480 tacaggccca tagttgacta catcgatgcg cagtttgaaa attatctgca ggaggagctg 540 aagatccgcc gctcgctctt cgactaccat gacacaagga tccacgtttg cctctacttc 600 atcacgccca cagggcactc cctgaagtct ctagatctag tgaccatgaa gaaactagac 660 agcaagatcc tggcaggcga acattattcc catcatcgcc aaggctgaca ccatctccaa 720 gagcgagctc cacaagttca agatcaagat catgggcgag ttggtcagca acggggtcca 780 gatctaccag ttccccacgg atgatgaggc tgttgcagag attaacgcag tcatgaatgc 840 acatctgccc tttgccgtgg tgggcagcac cgaggaggtg aaggtgggga acaagctggt 900 ccgagcacgg cagtacccct ggggagtggt gcaggtggag aatgagaatc actgcgactt 960 cgtgaagctg cgggagatgt tgatccgggt gaacatggaa gacctccgcg agcagaccca 1020 cagccggcac tacgagctct accggcgctg caagttggag gagatgggct ttcaggacag 1080 cgatggtgac agcccggcgg cggggctccg gctgcgctcg tggccgggcc gggcggggag 1140 gccggtcccg cgggcggggg caggggcggc tccgcggctt ctcccgccgc cgccgccaag 1200 gggagtttcc aggaagtggc catattggat ccattcagcc gcagccgccc gggcggagcg 1260 cgtcccgcag ccggctggtc cctgtcgctg cccctgcgct cgtcccagcc cacccgcccg 1320 gtgcggagct cgccatggcg gccaccgacc tggagcgctt ctcgaatgca gagccagagc 1380 cccggagcct ctccctgggc ggccatgtgg gtttcgacag cctccccgac cagctggtca 1440 gcaagtcggt cactcagggc ttcagcttca acatcctctg tgtgggggag accggcattg 1500 gcaaatccac actgatgaac acactcttca acacgacctt cgagactgag gaagccagtc 1560 accatgaggc atgcgtgcgc ctgcggcccc agacctatga cctccaggag agcaacgtgc 1620 agctcaagct gaccattgtg gatgccgtgg gctttgggga tcagatcaat aaggatgaga 1680 gttacaggcc catagttgac tacatcgatg cgcagtttga aaattatctg caggaggagc 1740 tgaagatccg ccgctcgctc ttcgactacc atgacacaag gatccacgtt tgcctctact 1800 tcatcacgcc cacagggcac tccctgaagt ctctagatct agtgaccatg aagaaactag 1860 acagcaaggt gaacattatt cccatcatcg ccaaggctga caccatctcc 
aagagcgagc 1920 tccacaagtt caagatcaag atcatgggcg agttggtcag caacggggtc cagatctacc 1980 agttccccac ggatgatgag gctgttgcag agattaacgc agtcatgaat gcacatctgc 2040 cctttgccgt ggtgggcagc accgaggagg tgaaggtggg gaacaagctg gtccgagcac 2100 ggcagtaccc ctggggagtg gtgcaggtgg agaatgagaa tcactgcgac ttcgtgaagc 2160 tgcgggagat gttgatccgg gtgaacatgg aagacctccg cgagcagacc cacagccggc 2220 actacgagct ctaccggcgc tgcaagttgg aggagatggg ctttcaggac agcgatggtg 2280 acagccagcc cttcagccta caagagacat acgaggccaa gaggaaggag ttcctaagtg 2340 agctgcagag gaaggaggaa gagatgaggc agatgtttgt caacaaagtg aaggagacag 2400 agctggagct gaaggagaag gaaagggagc tccatgagaa gtttgagcac ctgaagcggg 2460 tccaccagga ggagaagcgc aaggtggagg aaaagcgccg ggaactggag gaggagacca 2520 acgccttcaa tcgccggaag gctgcggtgg aggccctgca gtcgcaggcc ttgcacgcca 2580 cctcgcagca gcccctgagg aaggacaagg acaagaagaa cagatcagat ataggagcac 2640 accagccggg catgagcctc tccagctcta aggtgatgat gaccaaggcc agtgtggagc 2700 ccttgaactg cagcagctgg tggcccgcca tacagtgctg cagctgcctg gtcagggatg 2760 cgacgtggag ggaaggattc ctctgaggca gcagctccaa cacatggggc cagctcagga 2820 ccaccagggc atggaactgg agaccatggt ttttaatgtt agaacagaaa acgccatact 2880 tttcctatat caatgatcaa aagtgcaaac aatttaaatt tccatcaggg aacatcaaat 2940 gttgcccaac ccttttcatt cctatccatg gctccgtaag gggcttgagg cttaatgccc 3000 atcctgtggc caagctgagc ttccactccg ggaccaaaaa aaaaaaaaaa gtctgctttg 3060 tgacatcatc gttatgagcg gaaagtacct agatgacaat gtttccattc tgaaaaatag 3120 aaacatacta ttcaagacca aggtagcaga aaagttactt gtatctgctt atcataagac 3180 gaaactctgc aacttggcaa cggtggccag ttttcgtaat gaaacagtct ttagtaattt 3240 aatcttcatg cttcataaca aaccaaaacc ccatgagatt tccacattgc ataattttgc 3300 cttactaaca gaatcatatc cttaaggatg accatcattc ccccaactaa aacaaataca 3360 aactaatgta tgatattttt ttaagtgcca gatcaatatg gtctaaagct tcaataagga 3420 ttgtgtgtag gtgaataaag acagctaagt gaatgtgtgt aaagtgtagc aaaagcagac 3480 agatatttat gtacagtatt catagaatgg aaagttaaat atttttgcag tgtgtattta 3540 aaagagaaac tcaccataat agtgccgtct aaaaatcttt gtaaagttaa tttaatgtcc 
3600 tttagaagtg ggagtctggt ggaactgtgt tggatttaag ataccttttc actcttccgt 3660 atgtcatgag ccttgtgcgt cacctcactg tggtgcatgt gcaagggcgt gtgcacgcct 3720 gtgctttgcc atcccatgtt gtaaacagct gttccaaagg cacaaacgag tttagggtag 3780 actctgtaaa cacctcctta ctcactatag tcaagaagtc cagcggcgtc ccaatataga 3840 ggtcccagtg cagtctgtcc agaatagcca gctccatcct cagcagctca ttcggggaat 3900 agtcagagcc atagtgcttt gtgaagtctt ttacttgtgg aataaactgt aaaaagaaaa 3960 taaagaggcc aaagccctaa aaaaaaaa 3988 35 4169 DNA Homo sapiens misc_feature Incyte ID No 5956978CB1 35 tttctttaag gaagggcatg ttacctataa taccaaacca caaaaggata gctgcggttt 60 tgggcgagga gagctcagag agtttcttgc atatggccct gtgatggcgg ccatggccct 120 gcatagacac gagctggaat ctgcaggtgg cagccaggac gctgcgtgtg tcgagtgcac 180 agtgtggctt ggtgccaacc atggcgaggg tggagagccc cgtgcctgca gcgcgcgctt 240 ccctcactgg gtcctgcgtc cttgggcagg cgatgcccct gcggggaggg gctggtccat 300 ccccggccag ccacggaccc acgcatggac ccagcgaccc acggacctgc ttacctgggc 360 gcggcgcggg tggcatgcgg ccacacggaa ggggcgcgct gggctgctgc ggcctctgca 420 gcttctacac ctgccacggg gcggccggag atgaaatcat gcaccaggac atcgtcccgc 480 tctgtgctgc cgacatccag gaccagctaa agaagcgctt tgcttacctg tccggtgggc 540 gggggcagga cggaagcccg gttatcacct tccctgacta cccggccttc agcgagattc 600 cggacaagga gttccagaat gtcatgacct acctcaccag catccccagc ctgcaggacg 660 ctggcatcgg attcatcctg gtgatagacc ggcgacggga caaatggacc tccgtgaagg 720 cgtccgtcct gcgcatcgca gcatctttcc cggcaaacct gcagctcgtc ctcgtgcttc 780 gcccgacggg ttttttccaa aggactctct ccgacatcgc tttcaaattc aatagagatg 840 actttaagat gaaggtgccg gtcataatgc tgagctccgt accagactta cacggttaca 900 tcgataagtc gcagctgacc gaggacctgg gtgggaccct ggactactgc cactcccggt 960 ggctgtgcca gcgcacggcc atcgaaagtt tcgccctcat ggtgaagcag acggctcaga 1020 tgctgcagtc cttcgggacc gagctggctg aaacagagct gcccaatgac gtccagtcga 1080 caagctcagt gctgtgtgcg cacacagaga agaaggacaa ggcgaaggag gatttgaggc 1140 tggcactgaa agaggggcac agtgtcctgg agagcctcag ggagctgcag gctgagggct 1200 cagagcccag tgtgaaccag gaccagcttg acaaccaggc caccgtgcag 
aggctcctgg 1260 cccagctgaa cgaaaccgag gctgccttcg atgagttctg ggcaaagcat cagcagaaac 1320 tggagcagtg tctgcagctc cggcactttg agcagggctt ccgggaggtc aaagccatct 1380 tggacgcagc gtcccagaag atagcaacct tcacagacat cggcaacagc ctggcgcatg 1440 tggagcacct gctgagggac ctggccagct tcgaggagaa atcaggcgtg gccgtggaga 1500 gggcccgggc cctgtctctg gacggcgagc agctcattgg gaacaagcac tacgcggtag 1560 actccatccg cccaaagtgc caggagctcc ggcacctctg tgaccagttc tctgcggaga 1620 tcgcaaggag gagggggctg ctcagcaagt ccctggagct gcaccgccgc ctggagacgt 1680 ccatgaagtg gtgtgatgaa gggatttacc tgctggcctc acaacctgtg gacaagtgcc 1740 agtcccagga cggcgcggag gctgccctcc aggaaatcga gaagtttttg gagaccggtg 1800 cggaaaataa gatccaggag ctcaacgcga tttacaagga atacgaatcc atcctcaacc 1860 aagatctcat ggagcacgtg cgaaaggtct tccagaagca ggcaagcatg gaggaggtgt 1920 tccaccgcag gcaggccagc ctgaagaagc tggcggccag gcagacgcgg cccgtgcagc 1980 cggtggcccc cagacccgag gcactggcaa agtcgccctg cccctcccca ggcattcggc 2040 gaggctctga gaactccagc tccgagggcg gtgcgctccg gagagggccc taccggaggg 2100 ccaagagtga gatgagtgag agccggcagg gccgcggctc agcgggggag gaggaggaaa 2160 gcctggccat cctgcgcagg cacgtgatga gcgagctcct ggacacagaa cgggcctacg 2220 tggaggagct gctgtgcgtc ctggagggct acgccgcgga gatggataac ccactgatgg 2280 ctcacctcct gtcaacaggc cttcacaaca agaaggatgt tttgtttgga aacatggagg 2340 aaatctatca cttccacaac aggatattcc tcagggagct ggaaaactac actgactgcc 2400 cagaactggt tggaagatgc tttctggaga ggatggaaga tttccagatc tatgagaagt 2460 actgtcagaa caagccccgc tctgagagcc tgtggagaca gtgctccgac tgcccgtttt 2520 tccaggaatg ccagagaaag ctggaccaca agctgagcct ggactcctac cttctgaagc 2580 cagtgcagag gatcaccaag taccagctgc tgctcaagga aatgctgaaa tacagcagga 2640 actgcgaggg ggctgaggac ctgcaggagg cgctgagctc catcctgggc atcctgaagg 2700 ccgtgaacga ctccatgcac ctcatcgcta tcaccggcta tgacgggaat ctcggcgacc 2760 tgggcaagct gctgatgcag ggctcattca gcgtctggac cgaccacaag aggggccaca 2820 ccaaggtgaa ggagctggcc aggttcaagc ccatgcagcg gcacctgttc ctgcacgaga 2880 aggcagtgct cttctgcaag aagagggagg agaatgggga ggggtatgag aaagctccct 
2940 cctacagcta caagcagtcc ttaaacatgg ctgccgttgg cattacggag aacgtgaagg 3000 gagatgctaa gaagttcgag atctggtaca acgcgcgcga ggaggtctac atcgtccagg 3060 cgccaactcc tgagattaaa gccgcgtggg tgaatgaaat tcggaaagtg ctgaccagcc 3120 agctgcaggc ttgtagagaa gccagccagc accgggcgct ggagcagtca cagagcctgc 3180 ccctgccggc cccgaccagc accagtccct caagaggaaa ctcaaggaac atcaagaagc 3240 tggaagaaag gaaaacagac cccctaagcc tggagggata cgtcagctca gcgccactga 3300 caaagccccc cgaaaagggc aaaggttgga gcaaaacgtc ccactcactg gaggcacctg 3360 aggacgacgg gggctggtca agtgcagagg agcagattaa ctcgtccgac gcagaggagg 3420 acggcgggtt gggccccaag aagctggttc caggtaaata cacggtcgtg gcggaccacg 3480 agaagggagg ccccgatgcg ctgcgcgtga ggagcgggga cgtggtggag ctggtgcagg 3540 agggcgacga gggcctctgg taagaccccg cgctcagccc cggactgccc cgcacgtggc 3600 tgccgctgac cctcgcccct tgcagaaaca tattctgttc aaaacttttg aagccctttc 3660 ggtgtctagt ctgcagatgt ttttgtatgt gtgcacctct gaccatgtgt gtacatatgt 3720 gtcttgctgg aaaggacata ttcgctgtcc ccgtgctgct gggagggccg cctcacagcc 3780 tcacggttcc cagccccagc acagtggagg caggcgtggc tgcattcccc tcacgctacc 3840 ctcccagcgg cttgtagccg tcactggcca gacctccagg gtgcggaatc aaataggaag 3900 catgcagaga ctcggcagct tttcctctga tgtgtaagtt atttggaacg cgtgctgtgt 3960 cccgcgatgt ccctgatgta ctgtgcaggc gcggtgcctc cgtctcgtcg cacagctgcg 4020 cgcccttgtg tgaccctccc cataaaggca ctttacagct tcatgtttca tccactgtca 4080 ctttttttta actgctgatg taaatggaat tttaaaagca gagttcttta ttgtatggat 4140 gacgtttgaa taaatatcag caactccta 4169 36 2063 DNA Homo sapiens misc_feature Incyte ID No 7662817CB1 36 gtttcttgtg ttaactataa atgcgggtca gagtgaaagc cgagtaggga ggagtgaaaa 60 gggggaaaag ggaagagata agaggaagaa aagatgaaag agccggcgga agagggaggg 120 aagcgggggc gggcaggcga gcagaggcag ggaaagagcg gcgcgggttg cgggaaagct 180 ccgcgcgcag tacccaaccg aggcccgggc gcgcggtcct cccgccggca gggggcgccc 240 acccgcccgc ctccggcccc tcgcggccag ccgccccgcc cgcgcctgcg cttcctgccc 300 ctcctcgtcc gggatgctgc tgccgctact gcggagtagc tgcttccctt cctcctctcc 360 cggcggcggc ggcggcagcg gcggaggagg aggaggaggg 
gacccgggcg cagagagccg 420 gccggcggcg cagttgcagc gcggagccca cgggccgccg gggccgctcc agcgggcgga 480 agccgagccc gggggccgac cccccgcgcg cggcggaggc cgagggggcg ccggggcccg 540 gggcgcgcag ccggggggcg ggcggcggcg ggcggcgggc cgagcgggag ccgcatgcgt 600 gtatggatcc gggcgccgcg ctgcagaggc gggccggggg cggcggcggt ctgggcgcgg 660 gctccccggc gctgtcgggc ggccagggcc gccggaagaa gcagcccccc aggccggccg 720 acttcaagtt gcaggtcatc attatcggct cccgcggcgt gggcaagacc agcctgatgg 780 agcgcttcac cgacgacacc ttctgcgagg cctgcaagtc caccgtgggt gttgacttca 840 aaatcaaaac tgtagagcta agaggaaaga aaattagatt acagatctgg gacacagcag 900 gtcaggagag attcaacagc attacctcag cttattacag aagtgccaag gggatcatat 960 tagtatatga tatcactaag aaggagacat ttgatgattt gccgaaatgg atgaagatga 1020 ttgataagta tgcttcagaa gatgcagagc ttctcttagt tggaaataag ttggactgtg 1080 aaacggacag agaaatcacc aggcagcagg gggaaaagtt tgcacagcag atcactggga 1140 tgcggttctg tgaagcaagt gccaaggata acttcaatgt ggacgagata tttttgaaac 1200 ttgtcgatga cattctgaaa aagatgcctc tggatatttt aaggaatgag ttgtccaata 1260 gtatcctgtc gttacaacca gagcctgaga taccgccaga actgcctcca ccaagaccac 1320 atgtccgatg ctgttgattt cctactttgg agacaaagtg gaaatgattc ctggaaaggg 1380 gaaaaaacgt tctattctgc actacaatca ttttgacaat ttcctttcgc actttgtaat 1440 ccaagtcaga gctatacact aacttgtaaa tatgcatata tgcaatcctg ggtaagtttt 1500 ggttataagt tacctatttc cctccaaatt attatatttc attcattacc ccagtgtcta 1560 gtgtacatac actgggaaac ctagtacttc taatatgaag aatgggagaa atgaaaggta 1620 taatgtttct tgaaataaat aatataattg tccttattaa ttatattatg aggacagaag 1680 atattctgat aagagagaac gtggtgcttt gcttaccgtt ttaaagaaaa tttgtaaaac 1740 taaagacttt ttgaaaaaaa gctatcttaa gtgctttttc tttatttaca agacatttcc 1800 cccagtggta gcatctgaag tattggagtg tttctgccac gaagcaaagc tccattcatg 1860 gccgtcatgg aaggttattt attaatgtta cataatggta gaatattact agttagaggg 1920 ttggatttga cttggtccta aggccacaga atctctctca tggcttccta agggatgtac 1980 ctttatgctt ttaagaacta caaagattca ataaagaaag aaatgttttt gaaactatag 2040 aaaaagattt taaaacacgc tgc 2063 37 6382 DNA Homo sapiens 
misc_feature Incyte ID No 55139221CB1 37 gcgcctcccc gggccactga cgcccggcgc gctctccccc cgcggcggcg gccgaagcac 60 ggggaggcgg cggcggcggc ggcggcccga gcccagcccc atggcgcggg gggacgccgg 120 ccgcggccgc gggctcctcg cgttgacctt ctgcctgttg gccgcgcgcg gtgagctgct 180 gttgccccag gagacgactg tggagctgag ctgtggagtg gggccactgc aagtgatcct 240 gggcccagag caggctgcag tgctaaactg tagcctgggg gctgctgccg ctggaccccc 300 caccagggtg acctggagca aggatgggga caccctgctg gagcacgacc acttacacct 360 gctgcccaat ggttccctgt ggctgtccca gccactagca cccaatggca gtgacgagtc 420 agtccctgag gctgtggggg tcattgaagg caactattcg tgcctagccc acggccccct 480 cggagtgctg gccagccaga ctgctgtcgt caagcttgcc agtctcgcag acttctctct 540 gcacccggag tctcagacgg tggaggagaa cgggacagct cgctttgagt gccacattga 600 agggctgcca gctcccatca ttacttggga gaaggaccag gtgacattgc ctgaggagcc 660 tcggaggctc atcgtgcttc ccaacggcgt ccttcagatc ctggatgttc aggagagtga 720 tgcaggcccc taccgctgcg tggccaccaa ctcagctcgc cagcacttca gccaggaggc 780 cctactcagt gtggcccaca gaggttccct ggcgtccacc agggggcagg acgtggtcat 840 tgtggcagcc ccagagaaca ccacagtggt gtctggccag agtgtggtga tggaatgtgt 900 ggcctcagct gaccccaccc cttttgtgtc ctgggtccga caagacggga agcccatctc 960 cacagatgtc atcgtcctgg gccgcaccaa cctactaatt gccaacgcgc agccctggca 1020 ctccggcgtc tatgtctgcc gcgccaacaa gccccgcacg cgcgacttcg ccactgcagc 1080 cgctgagctc cgtgtgctgg cggctcccgc catcactcag gcgcccgagg cgctgtcgcg 1140 gacgcgggcg agcacagcgc gcttcgtgtg ccgcgcgtcg ggggagccgc ggccagcgct 1200 gcgctggctg cacaacgggg cgccgctgcg gcccaacggg cgcgtcaagg tccagggcgg 1260 cggtggcagc ctggtcatca cacagatcgg cctgcaggac gccggctact accagtgcgt 1320 ggctgagaac agcgcgggaa tggcgtgcgc tgccgcgtcg ctggccgtgg tggtgcgcga 1380 ggggctgccc agcgccccca cgcgggtcac tgctacgcca ctgagcagct ccgctgtgtt 1440 ggtggcctgg gagcggcccg agatgcacag cgagcagatc atcggcttct ctctccacta 1500 ccagaaggca cggggcatgg acaatgtgga ataccagttt gcagtgaaca acgacaccac 1560 agaactacag gttcgggacc tggaacccaa cacagattat gagttctacg tggtggccta 1620 ctcccagctg ggagccagcc gcacctccac cccagcactg gtgcacacac 
tggatgatgt 1680 ccccagtgca gcaccccagc tctccctgtc cagccccaac ccttcggaca tcagggtggc 1740 gtggctgccc ctgcccccca gcctgagcaa tgggcaggtg gtgaagtaca agatagaata 1800 cggtttggga aaggaagatc agattttctc tactgaggtg cgaggaaatg agacacagct 1860 tatgctgaac tcgcttcagc caaacaaggt gtatcgagta cggatttcgg ctggtacagc 1920 agccggcttc ggggccccct cccagtggat gcatcacagg acgcccagta tgcacaacca 1980 gagccatgtc ccttttgccc ctgcagagtt gaaggtgcag gcaaagatgg agtccctggt 2040 cgtgtcatgg cagccacccc ctcaccccac ccagatctct ggctacaaac tatattggcg 2100 ggaggtgggg gctgaggagg aggccaatgg cgatcgcctg ccagggggcc gtggagacca 2160 ggcttgggat gtggggcctg tccggctcaa gaagaaagtg aagcagtatg agctgaccca 2220 gctagtccct ggccggctgt acgaggtgaa gctcgtggct ttcaacaaac atgaggatgg 2280 ctatgcagca gtgtggaagg gcaagacgga gaaggcgccg gcaccagaca tgcctatcca 2340 gaggggacca cccctgcctc cagcccacgt ccatgcggaa tcaaacagct ccacatccat 2400 ctggcttcgg tggaaaaagc cagatttcac cacagtcaag attgtcaact acactgtgcg 2460 cttcagcccc tgggggctca ggaatgcctc cctggtcacc tattacacca gttctggaga 2520 agacatcctc attggcggct tgaagccatt caccaaatac gagtttgcag tgcagtctca 2580 cggcgtggac atggatgggc ctttcggctc tgtggtggag cgctccaccc tgcctgaccg 2640 gccctccaca cccccatccg acctgcgact gagccccctg acaccgtcca cggttcggct 2700 gcactggtgc ccccccacag agcccaacgg ggagatcgtg gagtatctga tcctgtacag 2760 cagcaaccac acgcagcctg agcaccagtg gaccttgctc accacgcagg gaaacatctt 2820 cagtgctgag gtccatggcc tggagagcga cactcggtac ttcttcaaga tgggggcgcg 2880 cacagaggtg ggacctgggc ctttctcccg cctgcaggat gtgatcacgc tccaggagaa 2940 gctgtcagac tcgctggaca tgcactcagt cacgggcatc atcgtgggtg tctgcctggg 3000 cctcctctgc ctcctggcct gcatgtgtgc tggcctgcgc cgcagccccc acagggaatc 3060 cctcccaggc ctgtcctcca ccgccacccc cgggaatccc gcgctgtact ccagagctcg 3120 gcttggcccc cccagccccc cagctgccca tgaattggag tcccttgtgc acccccatcc 3180 ccaggactgg tccccgccac cctcagacgt ggaggacagg gctgaagtgc acagccttat 3240 gggtggcggt gtttctgaag gccggagtca ctccaaaaga aagatctcct gggctcaacc 3300 aagcgggctg agctgggctg gttcctgggc aggctgtgag ctgccccagg caggcccccg 
3360 gccggctctg acccgggccc tgctgccccc tgctggaact gggcagacgc tgttgctgca 3420 ggctctggtg tacgacgcca taaagggcaa tgggaggaag aagtcacccc cagcctgcag 3480 gaaccaggtg gaggctgaag tcattgtcca ctctgacttt agtgcatcta acgggaaccc 3540 tgacctccat ctccaagacc tggagcctga ggaccccctg cctccagagg ctcctgatct 3600 catctcgggt gttggggatc cagggcaggg ggcagcctgg ctggacaggg agttgggagg 3660 gtgtgagctg gcagcccccg ggccagacag acttacctgc ttgccagagg cagccagtgc 3720 ttcctgctcc tacccggacc tccagccagg cgaggtgcta gaggagaccc ctggagatag 3780 ctgccagctc aaatccccct gccctctagg agccagccca ggcctgccca gatccccggt 3840 ctcctcctct gcctagctct tcccagagga tgtggtttgg ggcaggcagg tatggatcac 3900 ataggatgcg atacctgtgg ccgtgtatgt ccacatgtgt gcctgtagat acatcatcaa 3960 gccctttgga gcttcctaag ttgctttggc tgaggggaga ggaaaacatg gattattcac 4020 tccccccata ctctttgtga tacacatgtg acatgtgaaa gacatacgag acatagctac 4080 atgtgatgtg cacatgtgtg aagtgcatgt atgcgtactg gttgttgagc tgggaaaccg 4140 tggcccaggc agtggtcact acagcctgat tggtcctcca ggtcagaacg gtgccccaca 4200 gtggtcagtc cccagccctg tgggccccca cctccatcgc ccagcctttt attacacact 4260 ctgagagtgt ctccaatgcc tgtctgacaa agacagtccc agcccattct cctgtctggc 4320 tgggttgggt gcaagcaggc tctgaatgcc tggcatttca gctgcatcac ctcccagctc 4380 cttattgccc aaatagagag ggtggccctg gctcccctcc gagcaactct gcatttaatt 4440 ttgtaatctg ggaagtgcct ggttttgaaa atccgctttc tctcactctt cccctccttc 4500 cttgcccctg gctgctctag tgttctgtct cccagtcacc tcgctctccc agcaccagtg 4560 cccttctcct gctcccagat actctttcct ttcctctctc ctgttttcct tcctctgcta 4620 tctctcacac ctctcccaga ctatgtcatc ttgttctcct gcctgggttc aaactctgca 4680 tccttctcta acaacgtgac tacctcatgt ctgcttcaag gcccccgtgc ccttcctgta 4740 tccgcggctg ccgcgcactc gcctgccatc ctcctgcctc ctcttcactc agtgcttctg 4800 cttgccctgc cccaggcagc ccacccacgc ccagtgcggg tgtggagaag atcttctggc 4860 ttccctgcat cttgcctttg ggattgggat ccaagggttc tccatggatg gatccaagtc 4920 atagagggga atgtttgaga cagggaaggg ggctgtgatc cagaggctca gaataaaaag 4980 atgccctccc ttctatgcag gggggcaagt ttactggatg gagatgattt gggcctctct 5040 
tccagaagaa gctaaaggaa gagaagggga gtgagagttc agggaggccc ttcccaccct 5100 gtgaggcttg acttgatctg gattggggat gacaggaatc tcaccctctg gggtgctggc 5160 aaggaggtct ttgcacagga aaaggggtag ctcatttcag tttgtttttt ctttaaattg 5220 aatcctcaag tcattttctg ttcacctgcc gcacagggac aagcttgact tctattttct 5280 gtgtagtgaa aacaatgtca tttatttggt ttttcacctc agccctctca taggagcata 5340 gaatgttagg gtctttactc cctaatgatg tctgattggc acatcaagag ttaactctgc 5400 cttctgggcc aaattcgaaa taaccagtcc atttttcctt tttttttttt tttttttttt 5460 ttaaatggtg gaatgtctct cagcacagtt gcggcttcct caaaccctga aagcatctgt 5520 gtttattata ctcgggtgtc actcactgtt gatgtctgca cctacgtttc cacctcctcc 5580 ccctccttca gccagcctat gataacacta aagattatta atgttggttt tgtatctcgt 5640 taaagacaga attgtcactt gtagtatttc tgtagcattc agcgctgctg tggctaacac 5700 cactgtgtat gtttcatcat tgctctgaag gtcaaaagcc tcattttatt ttgctggttt 5760 gatttttttt ttttaaagaa gaaaaaaaaa ctgccctgaa ttaaatggct gttttaacag 5820 taggctctta gcattatacc acatagtcat ttttcatgtt cttgtttaac aggcactgag 5880 gttctggttt aaattaaata gctgcaaatg agacaattta taacccatta ggttgggtgg 5940 aaaattgttt ctcaaaagca aataagtaat aaatctggta tctgcctata actcacagtt 6000 gataagaaag tggccatttc tcactagcac tatatatgat ttgggctctg ggtaatttgg 6060 aagtgttagg tttgtgtctt tgtagcagta tttttattag aaaagaatct attggccttt 6120 tacagggtat taatcccttt gtcacctacc attgatgcct taagttttct gagtctcaat 6180 taaaaatctt ccttttcttg atgcatgaca agtgtaatca gtacttgctc atttatttgt 6240 ctgtatttag tttatgctgt actatttaat tatccttcca gcgttttttt tttctcctta 6300 caaatatgat actctttagt gttaagctaa ggcattgatt catgtatctg tccttataat 6360 gaattaataa actattttcc ag 6382 38 1369 DNA Homo sapiens misc_feature Incyte ID No 7493736CB1 38 agctgcataa ggactgcccg gggccgcgcg ccgggaacct cgcggggctg gcgggcgccg 60 caccccctcc ctggccgcct gcgccccggg gaggctgccc gcgcgcgacg ggaccggcag 120 catgagcagc ggctacagca gcctggagga ggacgccgag gacttcttct tcaccgccag 180 gacctccttc ttcaggagag cgccccaggg caagccccgc tccggccaac aagatgttga 240 gaaagagaag gaaacccaca gttacctcag caaagaggag atcaaagaga 
aagttcataa 300 atacaactta gcagtcacag acaagttgaa gatgaccttg aattcaaatg ggatttacac 360 tggcttcatt aaagtacaga tggaactctg caaacctcca cagacttctc caaattctgg 420 aaaactctct cccagtagca atggctgtat gaatacactt catatcagca gcacaaacac 480 tgtcggggaa gtgatcgagg ccctgctcaa aaagtttctc gtgactgaga gccctgccaa 540 gtttgcactt tataagcgtt gtcacaggga agaccaagtc tacgcctgca agctctcaga 600 ccgggaacat ccactctacc tgcgtttggt agcagggccc agaacagaca cacttagttt 660 tgttcttcgt gaacatgaaa ttggagagtg ggaagccttc agccttccag aactacagaa 720 tttcttgcgc atcttggaca aggaagaaga tgaacagctg cagaacctga agaggcgcta 780 cacagcctac aggcagaagc tggaagaagc cctccgtgag gtgtggaagc ctgattaaag 840 cggggctccc tgcccgtgag gcccggtgca ggaccgatgt acaaaacagc agcaagtgcc 900 tctcttctca gagggctgct gtcttgcccc actatgctag ggtcttcgcc tttctatctg 960 tagattttgt tccccaaacc tggtcacaca gcatcctgca ccttagcgtc cccatgtaag 1020 ggactctgca agcttgttgt tcagcaccgc cagtgttacc tcttggccag ctgtgaacct 1080 gtcgcctcat caggagcatt cgagggtgct tggaagcatg aactctgggg gtgttttttt 1140 ttttggttat tttaattatg tgatgatgct gttaagctta cagatatgca gttgattttt 1200 taaaaagctt aactaggaat ctttctgaat acttgtccta ttttgaaact acctgtctgg 1260 ttttttgttt ggttttgttt tcttttttta atgtcagagg aatctatcca tgtgaatcga 1320 aaggcacagg aatctaggaa ctctgaggaa tttattgtga aggaagcgg 1369 39 3730 DNA Homo sapiens misc_feature Incyte ID No 4614878CB1 39 gggcaacgca gggagcatgg attcgcagca gaccgatttc agggcgcaca acgtgccttt 60 gaagctgccg atgccagagc caggtgaact ggaggagcga tttgccatcg tgctgaatgc 120 tatgaaccta cctcctgaca aagccaggtt actgcggcag tatgataatg agaaaaaatg 180 ggaactgatt tgtgatcagg aacgattcca ggtgaagaat cctccccata catacattca 240 aaagctcaaa ggctatctgg atccagctgt aaccaggaag aaattcagac ggcgtgttca 300 agaatctaca caagtgctaa gagaactgga aatttctttg agaactaacc acattggatg 360 ggtcagagaa tttctgaatg aagaaaacaa aggtcttgat gttctagtgg aatatctctc 420 atttgcacag tacgcggtaa cttttgactt tgaaagtgtg gagagtactg tggagagctc 480 ggtggacaaa tcaaagccct ggagtaggtc catcgaggac ctgcacagag ggagcaacct 540 gccctcacct gtgggcaaca gtgtctcccg 
ctctggaaga cattctgcac tgcgatataa 600 tacattgcca agcagaagaa ctctgaaaaa ttcaagatta gtgagtaaga aagatgatgt 660 gcatgtctgt atcatgtgtt tacgtgccat catgaattat cagtatggtt tcaacatggt 720 catgtctcat ccacacgctg tcaatgagat tgcactaagc ctgaacaaca agaatcccag 780 aacaaaagcc cttgtcttag aactgttggc agccgtttgt cttgtcagag gcgggcatga 840 aatcatttta tcagcatttg ataactttaa agaggtttgt ggagaaaaac agcgctttga 900 gaagttgatg gaacatttca ggaatgaaga caataacata gattttatgg tggcttctat 960 gcagtttatt aatattgtag tccattcagt agaagatatg aatttcagag ttcacctgca 1020 gtatgaattt accaaattag gcctggacga atacttggac aagctgaaac acactgagag 1080 tgacaagctt caagtccaga tccaggctta cctggacaat gtttttgatg taggagctct 1140 actggaagat gctgaaacta agaatgctgc cttggagagg gtggaagaac tggaagaaaa 1200 catttctcat ttatctgaaa aactgcaaga cacagagaat gaagccatgt ccaagattgt 1260 ggaactggaa aagcaactca tgcagaggaa caaggagctg gatgtcgttc gggaaatcta 1320 caaagatgca aatactcaag ttcacacatt aagaaaaatg gtcaaagaaa aagaagaagc 1380 aattcaaaga cagtctaccc tggaaaaaaa gattcatgag ctagagaaac aagggaccat 1440 taaaattcag aagaaagggg atggggatat cgccatactg ccagttgtgg cttctggcac 1500 attgtccatg gggtcagaag tggtagcagg taactctgtg ggacccacaa tgggggccgc 1560 ttcctcagga cccttgcccc ctcctccacc accactgcct ccctcatcag acacacctga 1620 aacagtgcaa aatggtccag taacaccacc tatgccaccg ccgccgccgc caccccctcc 1680 tccacctcct cctcccccac cgccccctcc gcctcctcct ctcccaggcc ctgcagctga 1740 gactgtacca gctcctccct tagcacctcc ccttccctct gcacctccgc tgcctggaac 1800 atcttcaccc acagtggttt tcaactcagg attagcagct gtgaaaatta agaagccaat 1860 caagacgaag ttcagaatgc cagtgtttaa ctgggttgct ctgaagccca atcagatcaa 1920 tggcacagtc ttcaatgaaa ttgatgatga gcgaattctg gaggatttaa atgtggatga 1980 atttgaggaa atattcaaga caaaagccca aggacctgcc attgatcttt cttcaagcaa 2040 acagaagata ccacagaagg gatcaaacaa agtgacatta ctagaagcaa acagggccaa 2100 aaatcttgcc ataactttaa ggaaagctgg aaagactgct gatgaaatat gtaaagctat 2160 tcatgtattt gacttgaaga cactgcctgt ggactttgtg gaatgcttga tgcggttcct 2220 accaactgag aatgaagtga aagtgcttcg gctctacgag 
cgggaaagga agcctctgga 2280 aaacttgtca gatgaagatc ggttcatgat gcagtttagt aaaatcgaga ggctcatgca 2340 gaagatgacc atcatggcct tcattgggaa ctttgctgaa agcattcaga tgctgactcc 2400 tcaactacat gcgattatag cagcatctgt ctctataaag tcgtcccaaa aactcaagaa 2460 aattctggag atcatcttag cccttggaaa ctacatgaat agcagtaaaa gaggagcagt 2520 ttatggattt aaacttcaga gtttagatct gctcttagat acaaagtcaa cagacagaaa 2580 gcaaacactg ttgcactata tatccaatgt ggtgaaagaa aaatatcacc aagtgtccct 2640 gttttataat gagcttcatt atgtggaaaa agctgctgca gtctcccttg agaatgtttt 2700 gctggatgtc aaggagctcc agaggggaat ggacttgacc aagagagagt acaccatgca 2760 tgaccataac acgctgctga aggagttcat cctcaacaat gaggggaagc tgaagaagct 2820 gcaggatgat gccaagatcg cacaggatgc ctttgatgat gttgtgaagt attttggaga 2880 aaaccccaag acaacaccac cctctgtctt ctttcctgtc tttgtccggt ttgtgaaagc 2940 atataagcaa gcagaagagg aaaatgagct gaggaaaaag caggaacaag ctctcatgga 3000 aaaactccta gagcaagaag ctctgatgga gcagcaggat ccaaagtctc cttctcataa 3060 atcaaagagg cagcagcaag agttaattgc agaattaaga agacgacaag ttaaagataa 3120 cagacatgta tatgagggaa aagatggtgc cattgaagat attatcacag ccttaaagaa 3180 gaataatatc actaaatttc caaatgttca ctcgagggta aggatttctt ctagcacacc 3240 ggtggtggag gatacacaga gctgatctta gaaaccaacc atacagacga gccgatgcgg 3300 tgaggagaag cgtcaggcgg cgctttgatg atcagaactt gcgttctgtt aatggtgccg 3360 aaataacaat gtgaacctga gactggcctg catgaataca gggtgtgcgt gaatgaaact 3420 gcccacatga actttatgtg ctacgattta actgcagcct tgaacacaca caaaaatatt 3480 cttaagggct cagatttagc aaacacggaa gaattttaaa atgagctctc ctttcaaccc 3540 ttgttaacaa gtgcctaaaa atggaagtac ctgttcagat taatcaaagc aataggattt 3600 gatttgatta ggtatctttt tacaccagta tgttattttt aaccaaaatg taaagttctt 3660 attaaactca ttacctgcca ttgtgattgt cccatcatgg cccacctggt ttcctgatgt 3720 tgtaaataac 3730 40 2740 DNA Homo sapiens misc_feature Incyte ID No 7498437CB1 40 cgctaactcg gctacggtgt atctgcgtct ttggtcaggt tgttccttgg ctaagagggc 60 agtcgtcgcg gacccacgcg gttagcaagg cttagtgctc gggccggccg ccttcacttc 120 cctcccggct tttcctcccg acttatccac tttaggggcg 
tctcggagtg ccggaggccc 180 ccggggaaga gcggggtgcc ggtgtccgct ccgggctcgg atgggaagtg gtgggaggag 240 cgacccggga tgttcagtct gatggccatt tgctgcggct ggttcaagcg gcggcgggag 300 cctgtcagaa aggtgactct tttgatggtg ggacttgata atgctggtaa aaccgcaaca 360 gcaaagggaa tccaaggaga ataccctgaa gatgtagctc ctactgttgg attttcaaaa 420 attaacctta gacaaggaaa gtttgaagtc accatctttg acttgggagg tggaataaga 480 attcggggaa tctggaagaa ttactatgct gaatcctatg gggtaatatt tgttgtggat 540 tccagtgatg aagagagaat ggaagagaca aaagaggcta tgtcagaaat gctaagacat 600 cctaggatat cgggaaagcc tatattggtg ttggcaaata aacaagataa agaaggagct 660 ttaggagaag ctgatgtcat tgaatgtcta tctctggaaa aattggtcaa tgagcacaag 720 tgcctgtgtc agatagaacc atgttcagca atctcggggt atggaaagaa aattgacaag 780 tccattaaaa aaggccttta ttggctgcta catgttattg caagagactt tgatgcctta 840 aatgaacgca tccaaaaaga gacaacagag cagcgtgctc ttgaggaaca agagaaacaa 900 gaaagagctg aacgagtgcg aaaattacga gaagaaagaa aacaaaatga acaggagcag 960 gctgaactcg atggaaccag tggtctggct gagttggacc cagaaccaac gaatcctttc 1020 cagccaatag catctgtaat cattgagaat gaaggaaaac ttgaaagaga gaaaaaaaac 1080 caaaaaatgg agaaagacag tgatggctgc cacctgaaac ataaaatgga gcatgagcaa 1140 atagagacac aaggccaggt taatcacaat ggccaaaaaa ataatgaatt tggactagta 1200 gaaaattata aggaggcatt aacacagcag ttaaagaatg aagatgagac agaccggcca 1260 tcattggaat cagctaatgg taaaaagaaa actaagaaac taagaatgaa aaggaaccac 1320 cgggtagaac cacttaatat agatgactgt gctcctgaga gtccaacgcc acccccaccc 1380 cctcctcctg ttggctgggg aacccctaaa gtcactagac ttccaaaact tgagcctctt 1440 ggtgaaacac atcataatga tttctatagg aagccactgc ctcccctggc tgtgccacag 1500 cgacctaaca gtgatgctca tgatgtgatc tcataaacaa gacgtatgga ggagttctct 1560 taatatcagc aaggtgaact gggacattct tctttctcag aagaagaaac atcttgtaaa 1620 ttgatgactg gggcaagata accataataa ttttagtgag aagattaata ctcaaggacc 1680 tgacttgata attacttatt tgtgtttttc atggttaaaa caaaaaaaga agcacaatga 1740 ccagtacatg aaatcagcat ttggaccaaa ttagcaagat ttactgttga ctctggttta 1800 tacatcccca ctcatgagca tacttctgaa ggaaaacttt acaaaaagag ccaatggact 
1860 cagcactttc tttactattt gttcaatagc ctatatttct agatattaaa tattttttgt 1920 aagatatatg cacatagaag ggggacttct aggaatttat aaccaaaatt ttaaaactat 1980 ttttatattt ttttaaatct ttaaaaattt taattactgt gatattgatt atacaccttt 2040 tttatagatg atttgtttaa attatatctg gaaaatagag ttgatttact ttcagacatg 2100 aactatacaa acaggatata tttatatggc ttgaggtatg ttttgtaata gcgttgttct 2160 ttaagtgtac atcgtaattt gacctaattt aaaaataata tattttcgaa gtagtagatt 2220 ctcagatctt tattctaaca attacatgat ttgaaaacag tactcaatga gactggaaat 2280 accttttgta cctaaatgtt ttttaaatta atcaaaatac attctgtgag atactattaa 2340 aagtctcagt aaagtcctat tttaaaggtt agacactttc agagatcaag gtgctcccta 2400 tactcccttg ttccatcaga agctgcagtg actcttttag gtgattctaa ttctttcatg 2460 ccttgaaatt aaactatgaa attaaaggca gaaaaactgg aaaacctact gcaggtccag 2520 agacttctgt tttacaactg ggaaacagct tgtgtactaa agtaatgaga ataaaaatgt 2580 tggcccaaaa ttacttttat aaataaataa ttccaaaata atttcttaat ttaaaatgat 2640 agcacttctc ttgcacttaa gaatggcatc cataagattc tgtatttggc atttagacta 2700 acttctagat gtatattttt aataatagat tcactaatgt 2740 41 2087 DNA Homo sapiens misc_feature Incyte ID No 3097848CB1 41 gcaagaggtg agtgaggcaa ctgaaaactg ttcttggacc tgcggtgcta tagagcaggc 60 tcttctaggt tggcagttgc catggaatct ggacccaaaa tgttggcccc cgtttgcctg 120 gtggaaaata acaatgagca gctattggtg aaccagcaag ctatacagat tcttgaaaag 180 atttctcagc cagtggtggt ggtggccatt gtaggactgt accgtacagg gaaatcctac 240 ttgatgaacc atctggcagg acagaatcat ggcttccctc tgggctccac ggtgcagtct 300 gaaaccaagg gcatctggat gtggtgcgtg ccccacccat ccaagccaaa ccacaccctg 360 gtccttctgg acaccgaagg tctgggcgat gtggaaaagg gtgaccctaa gaatgactcc 420 tggatctttg ccctggctgt gctcctgtgc agcacctttg tctacaacag catgagcacc 480 atcaaccacc aggccctgga gcagctgcat tatgtgacgg agctcacaga actaattaag 540 gcaaagtcct ccccaaggcc tgatggagca gaagattcca cagagtttgt gagtttcttc 600 ccagactttc tttggacagt acgggatttc actctggagc tgaagttgaa cggtcaccct 660 atcacagaag atgaatacct ggagaatgcc ttgaagctga ttcaaggcaa taatcccaga 720 gttcaaacat ccaattttcc cagggagtgc atcaggcgtt 
tctttccaaa acggaagtgt 780 ttcgtctttg accggccaac aaatgacaaa gaccttctag ccaatattga gaaggtgtca 840 gaaaagcaac tggatcccaa attccaggaa caaacaaaca ttttctgttc ttacatcttc 900 actcatgcaa gaaccaagac cctcagggag ggaatcacag tcactgggaa tcgtctggga 960 actctggcag tgacttatgt agaggccatc aacagtggag cagtgccttg tctggagaat 1020 gcagtgataa ctctggccca gcgtgagaac tcagcggccg tgcagagggc atctgactac 1080 tacagccagc agatggccca gcgagtgaag ctccccacag acacgctcca ggagctgctg 1140 gacatgcatg cggcctgtga gagggaagcc attgcaatct tcatggagca ctccttcaag 1200 gatgaaaatc aggaattcca gaagaagttc atggaaacca caatgaataa gaagggggat 1260 ttcttgctgc agaatgaaga gtcatctgtt caatactgcc aggctaaact caatgagctc 1320 tcaaagggac taatggaaag tatctcagca ggaagtttct ctgttcctgg agggcacaag 1380 ctctacatgg aaacaaagga aaggattgaa caggactatt ggcaagttcc caggaaagga 1440 gtaaaggcaa aagaggtctt ccagaggttc ctggagtcac agatggtgat agaggaatcc 1500 atcttgcagt cagataaagc cctcactgat agagagaagg cagtagcagt ggatcgggcc 1560 aagaaggagg cagctgagaa ggaacaggaa cttttaaaac agaaattaca ggagcagcag 1620 caacagatgg aggctcaagt taagagtcgc aaggaaaaca tagcccaact gaaggagaag 1680 ctgcagatgg agagagaaca cctactgaga gagcagatta tgatgttgga gcacacgcag 1740 aaggtccaaa atgattggct tcatgaagga tttaagaaga agtatgagga gatgaatgca 1800 gagataagtc aatttaaacg tatgattgat actacaaaaa atgatgatac tccctggatt 1860 gcacgaacct tggacaacct tgccgatgag ctaactgcaa tattgtctgc tcctgctaaa 1920 ttaattggtc atggtgtcaa aggtgtgagc tcactcttta aaaagcataa gctccccttt 1980 taaggatatt atagattgta catatatgct ttggactatt tttgatctgt atgtttttca 2040 ttttcattca gcctggccaa catggcggaa ccctgccttt actgaaa 2087 42 7034 DNA Homo sapiens misc_feature Incyte ID No 2957789CB1 42 atcactcgtc ccgcttcctc tggatctctc taggagctga cactcgaacc ttcacgaccc 60 attcggattt ttccaggact caggagtggc actgggaaga aggggaccgc tttctgcaat 120 tggcctcgac actggctgcc aagaagacct gtcgcctttg tttttaagtc tccagaaatg 180 gaagaagaag gcgaagtcag ttgaagtcac gagaaatcag cgaggcattt gaaggcgctt 240 cctgaaaacc tttattccct ggcacgttgt ctgtttcaac agccctgccc ctcctcggag 300 cctgcttgtg 
gaattcttcc ccttcgggtg tgtggtggca ttccccgcca cgtccaatgt 360 ggactccaaa ggatttgtcc cttctttgtc atttgaaatg aaatgatggc cacgcgtcgg 420 actggtctgt ctgagggaga tggtgacaag ctcaaggcct gcgaggtctc aaaaaataaa 480 gatggaaaag aacaaagtga aactgtatca ctgtctgaag atgaaacatt ctcctggcca 540 ggtcccaaaa cagttacgtt gaaaagaaca tctcaaggct ttggttttac attaagacat 600 tttattgttt atcccccaga gtctgcaatt caattttcat ataaggatga agaaaatgga 660 aacagaggag gaaaacaaag aaaccgcttg gaaccaatgg ataccatatt tgttaagcaa 720 gttaaagaag gaggacctgc ttttgaagct ggattatgta caggtgaccg aattataaaa 780 gtcaatggag aaagtgttat tggcaaaacc tattcccaag taattgcttt aattcaaaac 840 agtgatacaa cattggaact tagtgttatg ccaaaagatg aagacattct ccaagtgcta 900 cagtttacaa aggatgtcac agcactggca tattctcaag atgcctacct gaaaggcaac 960 gaagcttata gcggcaatgc ccgcaatata cctgaacctc caccaatctg ctatccctgg 1020 ctgccatctg ccccatcagc catggcacag ccagttgaaa tatctcctcc tgactcatca 1080 ttgagcaaac agcaaaccag tacaccagta ctgacacaac ctggtagggc ctatagaatg 1140 gaaatacaag tgcctccatc accaacagat gttgcaaaat caaacacagc agtgtgtgtt 1200 tgcaatgaaa gtgtaaggac tgtcattgtg ccttctgaga aggttgtaga tttgttatcc 1260 aatagaaaca accatacagg tccttcacat agaactgaag aagtgaggta tggcgtgagt 1320 gagcagacct ctttaaaaac agtgtcaaga accacatcac caccattatc aattcccacc 1380 actcatctaa ttcatcagcc tgcaggctcc agatcactgg aaccttctgg aattttactt 1440 aagtctggaa attacagtgg acattctgat ggaatctcaa gcagcagatc tcaagctgtg 1500 gaggctccct ctgtatctgt taatcactat tcgccaaatt cccatcagca catagactgg 1560 aaaaactata aaacttacaa agagtatatt gataacagac gattgcacat aggttgtcgg 1620 acaatacaag aaagattaga tagtttaaga gcagcatctc aaagcacgac agattataac 1680 caggtcgtcc ccaaccgcac tactttgcag ggacgacgtc gaagcacctc tcatgatcga 1740 gtgccccagt ctgtccagat acggcaacgc agtgtgtccc aagaaagact ggaagattct 1800 gtgctaatga agtattgtcc aagaagtgca tctcaaggag cactgacgtc tccatctgtt 1860 agttttagta atcatagaac tcgttcatgg gattatattg agggacagga tgaaacctta 1920 gaaaatgtca attctggaac tccaatacct gattccaatg gagagaaaaa acagacttac 1980 aagtggagtg ggtttactga acaggatgat 
agacgaggta tttgtgaaag acctaggcag 2040 caagaaattc ataaatcttt tcgaggttcc aattttactg tggctccaag cgttgttaat 2100 tctgataaca ggcgaatgag tggtagagga gtgggatctg tgtcgcagtt taaaaaaatt 2160 ccaccagatc taaaaacatt gcagtcaaac agaaattttc agactacttg tggaatgtca 2220 ctgcctcggg gtatttcaca agacaggtca cctcttgtga aagtccgaag taattctctg 2280 aaagctcctt ccacgcatgt cacaaaacca tcatttagcc agaaatcatt tgtttctatc 2340 aaagaccaaa gaccagtaaa tcacttgcat cagaacagtc tgttgaatca gcagacatgg 2400 gtaaggactg acagtgcccc cgatcagcaa gtggagactg ggaaatcccc ctctttatct 2460 ggagcctctg ccaagcctgc ccctcagtcg agtgaaaacg ctggtacttc agatttagaa 2520 ctacctgtca gtcaaaggaa tcaagattta agtttacaag aggctgaaac tgagcaatca 2580 gatactttag ataataaaga agctgtcatc ctaagggaaa aacctccatc tggacgccag 2640 acaccgcagc ctttaaggca tcagtcttac atcttggcag taaatgacca ggagaccggg 2700 tcagacacta cctgctggct gcccaatgat gcacgtcgag aggtccacat aaaaagaatg 2760 gaggaaagaa aagcctcgag taccagtccg cctggcgatt ctttggcttc catcccattt 2820 atagatgaac caactagccc tagcattgat catgatattg cacatatccc tgcctctgct 2880 gttatatcag cctctacctc tcaggtcccc tccatagcaa cagttcctcc ttgcctcaca 2940 acttcagctc cattaattcg ccgtcagctc tcacatgacc acgaatctgt tggccctcct 3000 agcctggatg ctcagcccaa ctcaaagaca gaaagatcaa aatcatatga tgagggtctg 3060 gatgattaca gagaagatgc aaaattgtcc tttaagcacg tatctagtct gaagggaatc 3120 aagatcgcag acagccaaaa gtcatcagaa gactctgggt ccagaaaaga ttcttcctca 3180 gaggtcttca gtgatgctgc caaggaaggg tggcttcatt tccgacccct tgtcaccgat 3240 aagggcaagc gagttggtgg aagtattcgg ccatggaaac agatgtatgt tgtccttcgg 3300 ggtcattcac tttacctgta caaagataaa agagagcaga cgactccgtc tgaggaagag 3360 cagcccatca gtgttaatgc ttgcttgata gacatctctt acagtgagac caagaggaaa 3420 aatgtgtttc gactcaccac gtccgactgt gaatgcctgt ttcaggctga agacagagat 3480 gatatgctag cttggatcaa gacgatccag gagagcagca acctaaacga agaggacact 3540 ggagtcacta acagggatct aattagtcga agaataaaag aatacaacaa tctgatgagc 3600 aaagcagaac agttgccaaa aacacctcgc cagagtctca gcatcaggca aactttgctt 3660 ggtgctaaat cagagccaaa gactcaaagc ccacactctc 
cgaaggaaga gtcggaaagg 3720 aaacttctca gtaaagatga taccagtccc ccaaaagaca aaggcacatg gagaaaaggc 3780 attccaagta tcatgagaaa gacatttgag aaaaagccaa ctgctacagg aactttcggc 3840 gtccgactag atgactgccc accagctcat actaatcggt atattccatt aatagttgac 3900 atatgttgca aattagttga agaaagaggt cttgaatata caggtattta tagagttcct 3960 ggaaataatg cagccatctc aagtatgcaa gaagaactca acaagggaat ggctgatatt 4020 gatatacaag atgataaatg gcgagatttg aatgtgataa gcagtttact aaaatccttc 4080 ttcagaaaac tccctgagcc tctcttcaca aatgataaat atgctgattt tattgaagcc 4140 aatcgtaaag aagatcctct agatcgtctg aaaacattaa aaagactaat tcacgatttg 4200 cctgaacatc attatgaaac acttaagttc ctttcagctc atctgaagac agtggcagaa 4260 aattcagaaa aaaataagat ggaaccaaga aacctagcaa tagtgtttgg tcccaccctt 4320 gttcgaacat cagaagacaa catgacccac atggtcaccc acatgcctga ccagtacaag 4380 attgtagaaa cgctcatcca gcaccatgac tggtttttca cagaagaagg tgctgaagag 4440 cctcttacaa cagtgcagga ggaaagcaca gtagactccc agccagtgcc aaacatagat 4500 catttactca ccaacattgg aaggacagga gtctccccag gagatgtatc agattcagct 4560 actagtgact caacaaaatc taagggttct tggggatctg gaaaggatca gtatagcagg 4620 gaactgcttg tgtcctccat ctttgcagct gctagtcgca agaggaagaa gccgaaagaa 4680 aaagcacagc ctagcagctc agaagatgaa ctggacaatg tattttttaa gaaagaaaat 4740 gtggaacagt gtcacaatga tactaaagag gagtccaaaa aagaaagtga gacactgggc 4800 agaaaacaga agatcatcat tgccaaagaa aacagcacta ggaaagaccc cagcacgaca 4860 aaagatgaaa agatatcact aggaaaagag agcacgcctt ctgaagaacc ctcaccacca 4920 cacaactcaa aacacaacaa gtcaccaact ctcagctgtc gctttgccat cctgaaagag 4980 agccccaggt cacttctggc acagaagtcc tcccaccttg aagagacagg ctctgactct 5040 ggcactttgc tcagcacgtc ttcccaggcc tccctggcaa ggttttccat gaagaaatca 5100 accagtccag aaacgaaaca tagcgagttt ttggccaacg tcagcaccat cacctcagat 5160 tattccacca catcgtctgc tacatacttg actagcctgg actccagtcg actgagccct 5220 gaggtgcaat ccgtggcaga gagcaagggg gacgaggcag atgacgagag aagcgaactc 5280 atcagtgaag ggcggcctgt ggaaaccgac agcgagagcg agtttcccgt gttccccaca 5340 gccttgactt cagagaggct tttccgagga aaactgcaag aagtgactaa 
gagcagccgg 5400 agaaattctg aaggaagtga attaagttgc accgagggaa gtttaacatc aagtttagat 5460 agccggagac agctcttcag ttcccataaa ctcatcgaat gtgatactct ttccaggaaa 5520 aaatcagcta gattcaagtc agatagtgga agtctaggag atgccaagaa tgagaaagaa 5580 gcaccttcgt taactaaagt gtttgatgtt atgaaaaaag gaaagtcaac tgggagttta 5640 ctgacaccca ccagaggcga atccgaaaaa caggaaccca catggaaaac gaaaatagca 5700 gatcggttaa aactgagacc cagagcccct gcggatgaca tgtttggagt agggaatcac 5760 aaagtgaatg ccgagactgc taaaaggaaa agcatccggc gcagacatac actaggaggg 5820 cacagagatg ctaccgaaat cagcgttttg aatttttgga aagtgcatga gcagagcggg 5880 gagagagaat ctgaactttc agctgtaaac cggttaaaac caaaatgctc agcccaggac 5940 ctttccatct cagactggct ggccagggaa cgcctacgca ccagtacctc tgaccttagc 6000 agaggagaaa tcggagatcc ccagacagag aacccaagca cacgagaaat agccacgacc 6060 gacacacctt tgtctcttca ttgcaacaca ggcagttctt ccagcacctt ggcttcaaca 6120 aacaggcccc ttctttccat accaccacag tcacctgacc aaataaacgg agaaagcttc 6180 cagaacgtga gcaaaaatgc tagttctgca gcgaatgccc aacctcataa actgtctgaa 6240 accccaggca gtaaagcaga gtttcatccc tgtctttaaa ctgggggtat gtccactcta 6300 gcaagtaaaa aaactactgt tacacgttcc agtaactctg tcaatatttt cttgtatcag 6360 aattgttatt atgcagcctt catttgggct ggtttcatca ttttgcactg tgaaatagct 6420 ttacagtgca ttactacagc cagaagaaca tatatatata tatatattta aaaatatatc 6480 ggatagttgt atacaaatga gcaaggtatt tgttgcaact tactacatag catataccca 6540 aaatcactga agaaaatcgc tggcatcagt gtgcagcaaa tttgttcttt tggtttcatc 6600 actaacaaaa gtgcctcatc ataaaaatac agttggtttt tagggtgcca tattgttaaa 6660 attagataac ttacttacat tgaataaacg aatgcgtttt attggtaaca gatatcatta 6720 catttaccag ttttaacaca ggtggataca gaacttccat tctttagtca ttccaggtgg 6780 atctgagttt tatattcaaa cttttaatac agtttttgag ttttgtgtga cttgaatttt 6840 taatctttct gtaaaatacg taacttaaat gaacatatta aatgtgtatc ttttcttcag 6900 ataccagatt tgatataatg ttgtaacata ggtgtgtaga tagtggatcc tggatggaac 6960 tggcttcttt atcgagaaga atataattct gcatgaggac ttaatgaatc caaacctgtg 7020 tcatgcctgt gtgc 7034 43 305 DNA Homo sapiens misc_feature Incyte 
ID No 5922849CB1 43 cggggcccgc cctctcgccc ccacctcggg acccagacgt atcctaacct acccccacaa 60 cccccactcc cgcgcgtctt ttaacccctt ccccgccgca accatgtcca acaacatggc 120 caagattgcc gaggcccgca agacggtgga acagctgaag ctggaggtgt cgcaggcagc 180 agcggaactc ctggctttct gcgagacgca tgccaaagat gacccgctgg tgacgccagt 240 acccgccgcg gagaacccct tccgcgacaa gcgcctcttt tgtggtctgc tctgagccct 300 ccgga 305 44 1373 DNA Homo sapiens misc_feature Incyte ID No 7472828CB1 44 ggatcccttg caagtcaggt cccccctgag tgactctgca cacctcctgc ctcccagagc 60 ccagcactgc tgagtcacag gggcctagct ctcccatctg tcttggccag agtcaagggt 120 gcaggggaca ggggatgggc ataggtggaa gggccccttg tttccagact gcctacctca 180 gtggccccgt ggcagcattg ggaatgcctg tgatgagtgg ttcctttgag ttgaagcact 240 ggagggcggg tggtttatta tactggggaa aggtgggcag ggaggaggtg aggccagctt 300 gtctgtccag actccccagg ttggagagaa gatgctccca gcccttcctt cctgcctgcc 360 ccaggacttg gagacacaag gcccaggagg aggacgagga ggaaaataag tatgagctgc 420 ccccctgtga ggctctgccc ctcagtctag cccctgccca ccttcctggc actgaggagg 480 actccttgta cctggatcac tctggccccc tgggtccatc aaagccatcg ccacccctgc 540 ctcagcccac catgctgaag ggagcagtga gcctgccggt ggccggaaag cagggaccta 600 tctttgggag gcgagagcag ggtgcatcgt ccagagtggt gccaggccct ccaaagaaac 660 ctgatgagga cctctacttg gaatgtgagc cggatccagt cctggctttg actcagactc 720 tcagcttcca agtcctgatg ccctcaggcc ctctgcccag gacatcagtg gtgcccaggc 780 ctaccacagc cccccaggaa actcggaatg gaacagcaga tgctgcctct aaagaaggaa 840 ggaaatcgtc tcttccctct gtagccccca ctgggagtgc ctcagctgct gaggatgggg 900 cctataccgt gcgccccagc tcagggcctc atggctccca gcccttcacc ctggcagtgc 960 ttctccgagg ccgggtcttc aacattccca tccggcggct ggatggcgga cgccactatg 1020 ccctgggccg ggagggcagg aaccgtgagg agctcttctc ctccgtggcg gccatggtcc 1080 agcacttcat gtggcaccct ctgccccttg tggacagaca cagcggcagc cgggaactca 1140 cctgcctgct cttccccacc aagccttgag gccacagcga agaatgcagg tgtctgccca 1200 gttcactagg tcctggatga aggaaccgtg gtggcctaga ccagtcaggg gacagcacag 1260 gcactgctgg aacagcaaag gatcctctca catctacttg tgggcctagg cagcctgaga 1320 gggactggcc 
taccttgcac aagttcacat tcaataaaca tttgttgaat gaa 1373 45 3108 DNA Homo sapiens misc_feature Incyte ID No 8088595CB1 45 cagatgttct ctgttttgga ggggatatcc taaatatcac aggtattttt ttagtttttc 60 tgaatactta aaagactgag tgaggaggac tatatttagg tggaattcag catattggtt 120 tattcagcat tgtttgatta ataggtactg gaccctgggg atacaaagat gaggaagaca 180 tagtctgtgc ctcaaaggat cccccagtct aatggaaaag ctttgcccaa attaaactct 240 cacatgcttt gcatttttgt ttagattgaa ggacagcaaa ataagtgcca ataaagcgtt 300 cttgagaaaa ctaatgagaa atttggtcat taatagcaga agctaagcat gagatgggga 360 ggagttaagc atgtcctatc tggagaactc aggatcacta acagctgtat gccccggagt 420 tggcctgtct tcttatttaa acataataaa accagccatc aaattagata tgaacttgaa 480 cccacatttt atgaatcttt tgctagtacc tctcctcctc tgttagtata ctggtggaag 540 gaagattacc aaaaatatcc ttaggatcat ttaaacttct acctagttac gtttttactg 600 aatgattttg aggatctgaa gcagtatttt aacctagcat aatgtataca gccaagaaac 660 cttgcctgat aacttacttg atatccacac tgggtcaagt tgatcttctg cattactggg 720 actttgccct ggaccaagta agtctagata tttggtcccc atcagcctat cctgtgtgtc 780 tgcagggaac atactcctgg cactctggag agtagaacaa aatcaactga gtgacatcat 840 taccttggtg gcaaaatgca gtctggagat acaggagaaa tttgggatct atcataccag 900 ggaaggccag gacctgcagt taaagatagg tctacctgca agtcggattt cccaggttat 960 tgttggagat gagcggcagc aatacctctt ggtgattggg caggttgtag tgatgtccag 1020 ttagctcagc gtttggctca ggcgaatgaa attgtcctat cctggaactg ctggatgctt 1080 tgcaagcagt atatgtttga agtggcaatc atgagggagg atgaagctgt gaagattgat 1140 gaaggccagc ctatagagta tgtatctgaa ttccgtacga tgactctggt tttggtcagc 1200 ctagagttcc acaggacagc gtggatgttg catttgtgtc atcttatcca ggaggctgcc 1260 ttatacatct ccacagtcat tgagaaaggg ggcggccagc tgagtcggat ctttatgttt 1320 gagaaaggct gcatgttcct ctgtgttttc ggccttcctg gtgataagaa gccagacgag 1380 tgtgcacatg ccctggagag ctccttcagc atcttcagct tctgctggga gaatcttgct 1440 aagaccaact gaggaggaag gtggggcaga gaggagcttc tcaggcccca ggggctcttc 1500 aggcaggatc cctagatttg tttccatcag tatcactaat ggaccagtat tctgtggcgt 1560 ggttggagca gtagcaagac acgaatatac agttattggc 
ccaaaagtga gtcttgcggc 1620 cagaatgata actgcttatc caggtttggt gtcctgtgat gaggtaacat atctaagatc 1680 catgctacct gcttacaact tcaagaaact cccagagaaa atgatgaaaa acatctccaa 1740 cccagggaag atatatgaat atcttggcca cagaagatgt ataatgtttg gaaaaagaca 1800 tttggcaaga aagagaaaca aaaatcaccc tttgttagga gtgttaggtg ctccctgtct 1860 ctctacagac tgggagaaag aattggaagc cttccaaatg gcacagcaag ggtgtttgca 1920 ccagaagaag ggacaagcag ttctgtatga aggtggaaaa ggctatggaa aaagccagct 1980 gttggctgaa ataaactttc tggcacagaa agaagggcat agctaccctt cacaggtgct 2040 ttggaaaccc actttattgt gaggtcctat gccaggacct tctctctaag gacgtgttgc 2100 tctttcatgt cctacaaaag gaggaagagg aaaacagcaa gtgggaaacc ctctcagcca 2160 atgccatgaa atccataatg tatagtattt ctcctgccaa ctctgaggaa ggccaggaac 2220 tttatgtctg cacagtcaag gatgatgtga acttggatac agtacttctc ctaccctttt 2280 tgaaagaaat agcagtaagc caactggatc aactgagccc agaggaacag ttgctggtca 2340 agtgtgctgc aatcattggt cactccttcc atatagattt gctgcagcac ctcctgcctg 2400 gctgggataa aaataagcta cttcaggtct tgagagctct tgtggatata catgtgctct 2460 gctggtctga caagagccaa gagcttcctg ctgagcccat attaatgcct tcctctatcg 2520 acatcattga tggaaccaaa gagaagaaga caaagttaga acagaggaag agttcctaga 2580 tcaagtgaag aggaagctgg ctcagaccag ccctgagaaa gacctgttga ccacaaagcc 2640 ttgtcactgt aaggatatcc tgaagttagt gctcttaccc ctcacccagc attgcttggt 2700 cgttggagaa accacctgtg cattttatta cctgctggag gctgcggctg cctgcttgga 2760 cctgtcagat aattatatgg tctgtttcaa catgggacgt atcactttag ccaaaaaatt 2820 ggctaggaaa gcccttcgac tgctgaaaag gaatttccct tggacctggt ttggtgtcct 2880 tttccagaca ttcctggaaa agtattggca ttcctgtacc ctgagccaac ctccaaacga 2940 ccctagtgag aagtgagaag tcttcctaaa actgtagtta actagcctga gctttgcctt 3000 tttgacctaa aactactctt tttctatcaa gtaatcttca agcatctaac agacaagcag 3060 ataacaagac atgtaacagt cagcatacat atatatatgc atgtaaca 3108 46 1265 DNA Homo sapiens misc_feature Incyte ID No 7488478CB1 46 gagagcccca gctgctccag gttggggccg tgggggctcc ctaaggaggg gacaccaggg 60 ccccaatgtt tcacagaggg gcggggggca gcgaccctgg gcagcatggc ctcaggctga 120 
agcccgatct ggcatttcct tgtgctcata atgagaatag ttctccaatt agccaagatg 180 aacctcatgg acatcaccaa gatcttctcc ctcctgcagc ccgacaagga ggaggaggac 240 actgacacag aggagaagca ggctctcaat caagcagtgt atgacaacga ctcctatact 300 ttggaccagc ttttgcgcca ggagcgttac aaacgtttca tcaacagcag gagtggctgg 360 ggtgttcctg ggacaccctt gcgcttggct gcttcttatg gccacttgag ctgtttgcaa 420 gtcctcttag cccatggtgc tgatgttgac agcttggatg tcaaggcaca gacgccactt 480 ttcactgctg tcagtcatgg ccatctggac tgtgtacgtg tgcttttgga agctggtgcc 540 tctcctggtg gtagcatcta caacaactgt tctcccgtgc tcacagctgc ccgtgatggt 600 gctgttgcta tcctgcagga gctcctagac catggtgcag aggccaacgt caaagctaaa 660 ctaccagtct gggcatcaaa catagcttca tgttctggcc ccctctattt ggccgcagtc 720 tacgggcacc tggactgttt ccgcctgctt ttgctccacg gggcagaccc tgactacaac 780 tgcactgacc agggcctatt ggctcgtgtc ccaagacccc gcaccctcct tgaaatctgc 840 ctccatcata attgtgagcc agagtatatc cagctgttaa tcgattttgg tgctaatatc 900 taccttccat ctctctccct tgacctgacc tcacaagatg ataaaggcat tgcattgctg 960 ctacaggccc gagccactcc acggtcactt ctatcacagg tccgtttagt cgtccgcaga 1020 gccttgtgcc aggctggcca gccacaagcc atcaaccagc tggatattcc tcccatgttg 1080 attagctacc taaaacacca actgtaatct tgcagtctcc ccaggaactt atgatgcctc 1140 cgaaaaccac ctggggactc acgtagctgg agagcattac agcctcatcc acttacctgg 1200 agctgctctc ctgtattatc ctccacaata aaattctcca gaaaataaaa aaaaaaaaaa 1260 aaaaa 1265 

1. An isolated polypeptide selected from the group consisting of: a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-2, SEQ ID NO:4, SEQ ID NO:6-10, SEQ ID NO:13-14, and SEQ ID NO:16-23, c) a polypeptide comprising a naturally occurring amino acid sequence at least 91% identical to the amino acid sequence of SEQ ID NO:11, d) a polypeptide comprising a naturally occurring amino acid sequence at least 93% identical to the amino acid sequence of SEQ ID NO:15, e) a polypeptide comprising a naturally occurring amino acid sequence at least 94% identical to the amino acid sequence of SEQ ID NO:3, f) a polypeptide comprising a naturally occurring amino acid sequence at least 96% identical to the amino acid sequence of SEQ ID NO:5, g) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23, and h) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-23.
 2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-23.
 3. An isolated polynucleotide encoding a polypeptide of claim 1.
 4. An isolated polynucleotide encoding a polypeptide of claim 2.
 5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46.
 6. A recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide of claim 3.
 7. A cell transformed with a recombinant polynucleotide of claim 6.
 8. (Canceled)
 9. A method of producing a polypeptide of claim 1, the method comprising: a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide comprises a promoter sequence operably linked to a polynucleotide encoding the polypeptide of claim 1, and b) recovering the polypeptide so expressed.
 10. (Canceled)
 11. An isolated antibody which specifically binds to a polypeptide of claim 1.
 12. An isolated polynucleotide selected from the group consisting of: a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:24-46, c) a polynucleotide complementary to a polynucleotide of a), d) a polynucleotide complementary to a polynucleotide of b), and e) an RNA equivalent of a)-d).
 13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a polynucleotide of claim 12.
 14. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof.
 15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides.
 16. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
 17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable excipient.
 18-19. (Canceled)
 20. A method of screening a compound for effectiveness as an agonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting agonist activity in the sample.
 21-22. (Canceled)
 23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting antagonist activity in the sample.
 24-26. (Canceled)
 27. A method of screening for a compound that modulates the activity of the polypeptide of claim 1, the method comprising: a) combining the polypeptide of claim 1 with at least one test compound under conditions permissive for the activity of the polypeptide of claim 1, b) assessing the activity of the polypeptide of claim 1 in the presence of the test compound, and c) comparing the activity of the polypeptide of claim 1 in the presence of the test compound with the activity of the polypeptide of claim 1 in the absence of the test compound, wherein a change in the activity of the polypeptide of claim 1 in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide of claim 1.
 28. (Canceled)
 29. A method of assessing toxicity of a test compound, the method comprising: a) treating a biological sample containing nucleic acids with the test compound, b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof, c) quantifying the amount of hybridization complex, and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
 30-45. (Canceled)
 46. A microarray wherein at least one element of the microarray is a polynucleotide of claim 13.
 47-101. (Canceled)
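
By way of illustration only, the sequence relationships recited in claims 12 to 14 (a complementary polynucleotide, an RNA equivalent, and a probe of at least 20 contiguous nucleotides) can be sketched as below. This is a minimal sketch, not a method taken from the specification: the function names are hypothetical, the "RNA equivalent" is assumed to mean replacing thymine with uracil, and the example sequence is the first 60 nucleotides of SEQ ID NO:43 as printed in the sequence listing.

```python
# Illustrative sketch only; helper names are hypothetical, not from the specification.

# Translation table mapping each lowercase DNA base to its complement.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Assumed reading of 'RNA equivalent': same sequence with t replaced by u."""
    return dna.replace("t", "u")


def contiguous_fragments(seq: str, length: int) -> list[str]:
    """List every contiguous fragment of the given length (empty if seq is shorter)."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# First 60 nucleotides of SEQ ID NO:43, copied from the sequence listing.
SEQ43_PREFIX = (
    "cggggcccgc" "cctctcgccc" "ccacctcggg"
    "acccagacgt" "atcctaacct" "acccccacaa"
)

# A 20-nucleotide probe complementary to the first 20 nucleotides of the target.
probe = reverse_complement(SEQ43_PREFIX[:20])

# All 20-nucleotide windows of the 60-nucleotide prefix (41 of them).
windows = contiguous_fragments(SEQ43_PREFIX, 20)
```

Here `probe` is complementary to the first window of the target, so taking its reverse complement recovers that window. Whether such a probe hybridizes specifically under given conditions is an experimental question that this sketch does not address.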